Bug description

Chrome and Firefox now flag Superset pages as dangerous (or as phishing sites).

This blocks users in their browsers, and nothing has changed on my end. I strongly suspect this has to do with scarf.sh, which is running scripts that the browsers flag as problematic, specifically from this URL:

https://apachesuperset.gateway.scarf.sh/pixel/

I have a cluster deployed with Helm, using a valid and up-to-date TLS certificate from Let's Encrypt. Unfortunately, my clients can no longer access the cluster without overriding browser security recommendations.

I tried a variety of mechanisms to disable Scarf in my values.yaml, including manually setting:

image:
  repository: apache/superset
configOverrides:
  superset_config.py: |
    SCARF_ANALYTICS = False
    FEATURE_FLAGS = {
      "ENABLE_TELEMERTY": False
    }

Screenshots/recordings

[2 images not preserved]

Superset version

master / latest-dev

Python version

3.9

Node version

dunno...not sure how to figure this out easily from inside the container

Browser

Brave/Firefox

Additional context

Others are having this problem on different platforms: https://github.com/apache/superset/discussions/31856#discussion-7826104

As an aside, it feels unethical to distribute free software with tracking enabled, especially given no meaningful or straightforward way to ensure consent before enabling/disabling it.

Checklist

  • [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Comment From: dosubot[bot]

@Mycobee Hello there! I'm here to assist you with any bugs, questions, or contributions. Let's dive into this issue and get it resolved!

To address the issue of Superset pages being flagged as malicious due to scripts from scarf.sh, you can manually disable Scarf telemetry requests by modifying the source code. Here are the steps you need to follow:

  1. Delete the file: superset-frontend/src/components/TelemetryPixel/TelemetryPixel.test.tsx
  2. Delete the file: superset-frontend/src/components/TelemetryPixel/index.tsx
  3. Modify superset-frontend/src/features/home/RightMenu.tsx by removing the TelemetryPixel component section:
<TelemetryPixel
  version={navbarRight.version_string}
  sha={navbarRight.version_sha}
  build={navbarRight.build_number}
/>

These changes should stop the telemetry requests that might be causing the pages to be flagged [1].

To continue talking to Dosu, mention @dosu.


Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Other

Comment From: Mycobee

@dosu this is not helpful because my js is already compiled...

I am currently working on a fix to block scarf on the k8s network level

Comment From: dosubot[bot]

@Mycobee Oh, I see you're diving deep into the network level! I'm a bot designed to help with bugs, technical questions, and engineering knowledge, but it sounds like you're venturing into territory I can't assist with. If you think I should have more tricks up my sleeve, feel free to reach out to the Dosu Team. They'll be happy to hear your thoughts!

Comment From: Mycobee

@dosu I see now that the requests all come from the frontend, so that was a waste of time. How can I easily disable Scarf with my Helm chart?

Comment From: dosubot[bot]

To disable Scarf using your Helm chart, you can opt-out of the data collection by editing the repository: line in your helm/superset/values.yaml file. Replace apachesuperset.docker.scarf.sh/apache/superset with apache/superset to pull the image directly from Docker Hub. This change will prevent Scarf from collecting telemetry data [1][2].

Comment From: Mycobee

@dosu that doesn't work. Please provide another option.

Comment From: dosubot[bot]

To completely disable Scarf telemetry in Apache Superset, you can set the SCARF_ANALYTICS environment variable to false. This method is effective if modifying the Helm chart does not work. Set this environment variable in your deployment configuration to ensure it is applied where your Superset instance is running, such as in your Docker container or Kubernetes deployment:

export SCARF_ANALYTICS=false

If you are using Docker Compose, you can also edit the docker/.env file to include this environment variable, ensuring it is consistently applied across your deployment [1].

Comment From: Mycobee

I am done talking to the robot, but I am all ears if anyone has any ideas how to dynamically set this in the official registry containers with a precompiled frontend

Comment From: rusackas

It's correct, you can simply disable the Scarf gateway with the SCARF_ANALYTICS env var. You'll want to either disable it or allow your networking config to load the (anonymous) tracking pixel and send its telemetry ping when doing npm install. If you search for Scarf in the docs it provides a bit more info about this.

As for why it's being flagged as malicious in the first place, I'm not sure... @arjundevarajan might be interested or have more insights here.

Comment From: Rupantar-Borphukan

Can I try to solve the issue?

Comment From: Mycobee

rusackas wrote: "It's correct, you can simply disable the Scarf gateway with the SCARF_ANALYTICS env var"

I don't mean to split hairs @rusackas, but I think this is incorrect currently for a Kubernetes deployment with the superset Helm chart.

$ grep -i -e env -e scarf superset/values.yaml 
extraEnv:
  SCARF_ANALYTICS: "false"

$ kubectl exec -it -n superset superset-7764c554cf-zz52f -- bash

root@superset-7764c554cf-zz52f:/app# echo $SCARF_ANALYTICS
false

With these config options in place, browsers are still making calls to Scarf analytics.

With a docker-compose environment, there is a superset-frontend directory, in a separate container. That container runs the webpack server. But in the production kubernetes container, the assets are precompiled.

Scarf analytics is disabled via process.env (source 1, source 2).

JavaScript is not my expertise, but I believe this value is resolved at build time, not at run time (at least for source 2). That might explain why the problem is specific to precompiled assets?

But in the official Helm deployment, we do not have access to a webpack process, as everything is bundled into one container.

Did you read the thread of the similar issue? I linked it in my bug report: https://github.com/apache/superset/discussions/31856#discussion-7826104

This user had to manually delete files to fix their problem, and it seems they are running Docker. So IMO there is enough evidence to look further into this. Unfortunately, I cannot manually delete files like that user could, since my assets are all precompiled into one big file.
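
One way to confirm whether a precompiled image still has the pixel baked in is to search the built static assets for the gateway hostname. The following is a minimal sketch in Python, not part of Superset: the function name `find_references` is made up for illustration, and it assumes the URL appears in the bundle as a literal string.

```python
import os

# Hostname taken from the URL reported at the top of this issue.
NEEDLE = b"gateway.scarf.sh"


def find_references(root, needle=NEEDLE):
    """Return the sorted paths of files under `root` whose bytes contain `needle`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    if needle in fh.read():
                        hits.append(path)
            except OSError:
                # Unreadable files (permissions, broken links) are skipped.
                continue
    return hits and sorted(hits)
```

Inside the container this could be called as `find_references("/app/superset/static/assets")`; that path is an assumption about the image layout and may differ between versions. An empty result would suggest the pixel URL is not in the bundle as a literal, while any hit means the runtime environment variable alone cannot remove it.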

Comment From: Mycobee

@Rupantar-Borphukan , I am not a core contributor to this project. But I do think this issue needs further investigation. There are two publicly reported instances of the bug. And both instances report challenges disabling scarf analytics using the blessed environment variable.

Comment From: TristisOris

We have the same issue with some of our Chrome users. Update: after a few page visits, all browsers now show the warning.

SCARF_ANALYTICS=false is not working. Removing the TelemetryPixel code also doesn't work. This is a Docker installation.

Comment From: TristisOris

Something started to work after adding SCARF_ANALYTICS=false to ../superset/docker/pythonpath_dev/superset_config_docker.py, but not everywhere.

Comment From: rajuw892

I'm using the embedded SDK in my frontend React code (import { embedDashboard } from '@superset-ui/embedded-sdk') and I'm also getting the dangerous-site warning in Chrome. Can someone help me fix this?

Comment From: aviaviavi

@rusackas a few potential things we can do to fix this:

We now support custom variables on our native Scarf pixels which don't use a redirect. This may help appease Brave and other browsers without a loss of functionality.

If it doesn't, we also now support pixels on custom domains, so we could avoid needing to use a scarf.sh URL. I know that can be more challenging from an ASF perspective but just another option.

Happy to help here in any case!

The Scarf team is also reaching out to Brave and others to have us removed from these lists. While we are helping with privacy-conscious observability, there is absolutely no phishing or other malware associated with these URLs.

Comment From: marienpflaum

The problem here seems to be that the JavaScript is already compiled in one of the first stages of the Dockerfile. At that point the environment is already baked in, so later changes to the environment have no effect. A solution could be to define a new ARG "SCARF_ANALYTICS" in the Dockerfile and map it to an ENV parameter. In my opinion this ARG should default to false, but that is another discussion ;). This way, at least people who use the Docker image as a parent can influence the SCARF_ANALYTICS setting. Maybe... even make a Docker tag out of it ;)
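
To make the build-time point concrete, here is a toy simulation in Python. It is not Superset's build code: `build_bundle` is a hypothetical stand-in for a bundler step that reads the flag once and freezes the result into the output, and the default of "enabled when unset" is an assumption for the sake of the example.

```python
def build_bundle(build_env):
    """Simulate a build step that inlines the flag into the emitted bundle.

    `build_env` is the environment as seen at build time. The returned
    string stands in for the compiled JavaScript: the value is a literal,
    so nothing set later can change it.
    """
    enabled = build_env.get("SCARF_ANALYTICS", "true").lower() != "false"
    return f"const SCARF_ENABLED = {str(enabled).lower()};"


# Build stage: the variable is not set, so the pixel stays enabled.
bundle = build_bundle({})

# Deploy stage: setting the variable now (e.g. via extraEnv) only affects
# processes that read the environment at run time, not the frozen bundle.
runtime_env = {"SCARF_ANALYTICS": "false"}
bundle_after_deploy = bundle  # the shipped file is unchanged

# Only a rebuild with the variable present produces a different bundle.
rebuilt = build_bundle(runtime_env)
```

Under these assumptions `bundle` and `bundle_after_deploy` are identical, and only `rebuilt` has the flag turned off, which is the behavior an ARG passed into the frontend build stage would give.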

Comment From: robson83

@Mycobee were you able to find a fix? In my case, I created a content security policy to force the script not to load. But it's an ugly hack.
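
For anyone trying the same workaround, a rough sketch of the idea in Python follows. The helpers `strip_host` and `render_csp` are hypothetical, not Superset APIs, and the sample policy is invented for illustration. The sketch removes every source that mentions scarf.sh from a policy mapping and renders the header value, so the browser refuses to load anything from that host.

```python
BLOCKED_HOST = "scarf.sh"


def strip_host(policy, blocked=BLOCKED_HOST):
    """Return a copy of `policy` with every source containing `blocked` removed."""
    cleaned = {}
    for directive, sources in policy.items():
        if isinstance(sources, str):
            sources = [sources]
        cleaned[directive] = [s for s in sources if blocked not in s]
    return cleaned


def render_csp(policy):
    """Render a directive mapping as a Content-Security-Policy header value.

    Directives left with no sources are omitted, so they fall back to
    default-src in the browser.
    """
    return "; ".join(
        f"{directive} {' '.join(sources)}"
        for directive, sources in policy.items()
        if sources
    )


# Illustrative policy only; a real deployment will have more directives.
policy = {
    "default-src": ["'self'"],
    "img-src": ["'self'", "data:", "https://apachesuperset.gateway.scarf.sh"],
}
header = render_csp(strip_host(policy))
# header == "default-src 'self'; img-src 'self' data:"
```

If your Superset version exposes its CSP through a `TALISMAN_CONFIG` dict with a `content_security_policy` key in superset_config.py (an assumption to verify against your version's default config), the cleaned mapping could be assigned there instead of setting the header at a proxy.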