# Abusive Account Detection

# Helpful bunnylol
|||
-|-
aad                             | aad wiki pages
go aadata                       | abusive accounts data
fblearner                       | model training runs are listed here
orb                             | in-memory DB


# Data
|||
-|-
ig_signup_sigma_features        |
ig_challenge_....               | accounts that were challenged


# Proxy metrics
|||
-|-
enrolments  | tested users (enrolled into UFAC)
clear       | users who cleared UFAC; lower is better
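
To make the relationship between the two proxy metrics concrete, here is a minimal sketch that computes a clear rate as cleared users divided by enrolled users. The function name, the ratio definition, and the counts are assumptions made for illustration; the table above only says that a lower clear count is better.

```python
def clear_rate(enrolled: int, cleared: int) -> float:
    """Share of enrolled (tested) users who cleared the challenge."""
    if enrolled < 0 or cleared < 0:
        raise ValueError("counts must be non-negative")
    if cleared > enrolled:
        raise ValueError("cleared cannot exceed enrolled")
    if enrolled == 0:
        return 0.0
    return cleared / enrolled


# Hypothetical counts for two classifier variants in an experiment.
control = clear_rate(enrolled=1000, cleared=120)
test = clear_rate(enrolled=1000, cleared=90)
print(f"control={control:.3f} test={test:.3f}")
```

Comparing rates rather than raw counts keeps the comparison meaningful when the two variants enroll different numbers of users.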

# Human labeling
|||
-|-
holdout | signed-up users; after 12 days humans label them as empty, bad, or good (benign); ~5K accounts

false negative


MAU prevalence
MIMA prevalence
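
As a rough sketch of how the human labels could feed these numbers, the code below estimates the share of `bad` accounts in a labeled holdout sample and a false negative rate against classifier decisions. The definitions used here (prevalence as the `bad` share of the labeled sample, false negatives as `bad` accounts the classifier did not flag), the function names, and the sample data are all assumptions for illustration, not the team's official metric definitions.

```python
from collections import Counter
from math import sqrt

LABELS = ("empty", "bad", "good")


def label_shares(labels):
    """Fraction of the labeled sample that falls into each label."""
    if not labels:
        raise ValueError("need at least one labeled account")
    counts = Counter(labels)
    return {name: counts[name] / len(labels) for name in LABELS}


def bad_prevalence(labels, z=1.96):
    """Point estimate and normal-approximation interval for the 'bad' share."""
    n = len(labels)
    p = label_shares(labels)["bad"]
    half_width = z * sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def false_negative_rate(labels, flagged):
    """Share of human-labeled 'bad' accounts that the classifier did not flag."""
    bad_flags = [f for label, f in zip(labels, flagged) if label == "bad"]
    if not bad_flags:
        raise ValueError("no 'bad' accounts in the sample")
    return sum(1 for f in bad_flags if not f) / len(bad_flags)


# Hypothetical holdout of 5,000 labeled accounts and classifier decisions.
sample = ["bad"] * 250 + ["good"] * 4500 + ["empty"] * 250
flags = [True] * 200 + [False] * 4800
p, lo, hi = bad_prevalence(sample)
print(f"prevalence={p:.3f} ({lo:.3f}..{hi:.3f})")
print(f"false negative rate={false_negative_rate(sample, flags):.2f}")
```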

# Folders

- ## Misc
    - `fbcode/dataswarm-pipelines/tasks/si/fake_accounts`
    - `fbcode/dper3/dper3_models/si/olf`
    - `www/flib/intern/scripts/sigma/classifiers/olf`

    - `configerator/source/sigma/online_classifiers/runtimes`
        - all our classifiers are here
    - `configerator/source/si/fake_accounts`
        - defines active classifiers and the defaults

    - `si_sigma/Lib/FakeAccounts`
        - Sigma rules for fake accounts; notably, `new_user_registration` is processed here

- ## Models
    - `fbcode/fblearner/flow/projects/fluent2/domains/si/aad_surfaces`
        - fblearner models

- ## Sentry
    - `configerator/source/si/sentry/prod/<namespace>/<category>.cconf`
        - configuration of sentries.
        - e.g. `namespace=facebook, category=new_user_registration` defines what is passed to the Sigma rules. This config is used for FB; IG probably has a different one.
    - `configerator/source/si/sentry/si_namespaces.thrift`
        - sentry namespaces
    - `www/flib/si/sentry/category/SentryCategory.php`
        - existing categories
    - `www/flib/si/sentry/preparable/filters/sigma/SigmaFilter.php`
        - Sigma filter that can be found in sentry configurations.
    - bunnylol orb
        - this can be used to query sentry logs

- ## Sigma
    - `si_sigma/Endpoint/Sentry/SentryFollowProfile.hs`
    - `si_sigma/Contexts/Sentry/SentryFollowProfile.hs`
    - scuba sigma_profiling
        - used for profiling

- ## QE
    - `configerator/qe2_diff/newExperiments/vahagnk_fast_tiger_clone.txt`

- ## Reg Attack
    - `configerator/source/si/reg_attacks/surface_definitions.cinc`
        - Surfaces are defined here.
    - `configerator/source/si/reg_attacks/attack_definitions/`
        - Attack definitions.
    - `source/si/reg_attacks/reg_attacks.thrift`
        - FieldTypes are here.

- ## Pipelines
    - [dataswarm pipelines](https://www.internalfb.com/code/fbsource/fbcode/dataswarm-pipelines/tasks/si/fake_accounts/)
        - [online_reg](https://www.internalfb.com/code/fbsource/fbcode/dataswarm-pipelines/tasks/si/fake_accounts/online_reg/)

- ## OLF
    - `phps OLFAdminV2 status --classifier reg_enthusiastic_impala`
    - [firefighting](https://www.internalfb.com/intern/wiki/OLF/Firefighting/)
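
The Sentry config location listed above follows the template `configerator/source/si/sentry/prod/<namespace>/<category>.cconf`. A small sketch of filling in that template is shown below; the helper name is hypothetical, and the namespace and category values are the ones quoted in the Sentry notes.

```python
from pathlib import PurePosixPath

# Base directory taken from the path template in the Sentry notes.
SENTRY_PROD_ROOT = PurePosixPath("configerator/source/si/sentry/prod")


def sentry_config_path(namespace: str, category: str) -> PurePosixPath:
    """Build the .cconf path for a given sentry namespace and category."""
    if not namespace or not category:
        raise ValueError("namespace and category must be non-empty")
    return SENTRY_PROD_ROOT / namespace / f"{category}.cconf"


print(sentry_config_path("facebook", "new_user_registration"))
```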


# Model Training workflows

- Train the model
```
flow-cli canary si.olf.ig_signup.train@olf --run-as-secure-group=team_abusive_accounts_detection --entitlement si --parameters-file configs/ig_signup_andromeda_offline.json
```
To monitor progress, use [bunnylol fblearner](https://www.internalfb.com/intern/fblearner).

- Publish model
```
aimps publish-model --manifold --oncall aad_surfaces --is-dper-model -d service_sharded <model_id>_<snapshot_id>
```
To monitor progress, use [bunnylol predictor](https://www.internalfb.com/intern/predictor).

- Register model
```
phps --www-root /var/www OLFAdminV2 training --action=register_model --classifier-name <model_name> --problem IG_FA_ANDROMEDA --surface IG_SIGNUP --model-id <model_id> --threshold 0.5
```
The model name is `ig_signup_colorful_animal`.

- [Compare how model fires.](https://fburl.com/scuba/ig_signup_sigma_features/svhenbuq)
- Create an experiment to compare enrolled vs. cleared users. [bunnylol qe2](https://www.internalfb.com/intern/experiments)
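
The train, publish, and register commands above differ only in a few placeholders, so they can be assembled programmatically. The sketch below only builds the command strings and does not run anything. Every flag is copied from the commands above; the helper names and the example model and snapshot IDs are hypothetical.

```python
import shlex


def train_cmd(parameters_file: str) -> list[str]:
    """Command for the 'Train the model' step."""
    return [
        "flow-cli", "canary", "si.olf.ig_signup.train@olf",
        "--run-as-secure-group=team_abusive_accounts_detection",
        "--entitlement", "si",
        "--parameters-file", parameters_file,
    ]


def publish_cmd(model_id: str, snapshot_id: str) -> list[str]:
    """Command for the 'Publish model' step."""
    return [
        "aimps", "publish-model", "--manifold",
        "--oncall", "aad_surfaces", "--is-dper-model",
        "-d", "service_sharded", f"{model_id}_{snapshot_id}",
    ]


def register_cmd(model_name: str, model_id: str, threshold: float = 0.5) -> list[str]:
    """Command for the 'Register model' step."""
    return [
        "phps", "--www-root", "/var/www", "OLFAdminV2", "training",
        "--action=register_model",
        "--classifier-name", model_name,
        "--problem", "IG_FA_ANDROMEDA", "--surface", "IG_SIGNUP",
        "--model-id", model_id, "--threshold", str(threshold),
    ]


# Hypothetical IDs, used only to show the shape of the output.
for cmd in (
    train_cmd("configs/ig_signup_andromeda_offline.json"),
    publish_cmd("123456", "7"),
    register_cmd("ig_signup_colorful_animal", "123456"),
):
    print(shlex.join(cmd))
```

Keeping each command as a list of arguments and joining with `shlex.join` only for display avoids quoting mistakes if a value ever contains spaces.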

# Team identifiers
- `fawg` - the abusive account detection group for diffs

# Tables