2021-10-29 08:14:23 +01:00
parent c0879efa7f
commit 01b6c2baac
16 changed files with 496 additions and 16 deletions

Facebook/aad.md Normal file

@@ -0,0 +1,122 @@
# Abusive Account Detection
# Helpful bunnylol
|||
-|-
aad | aad wiki pages
go aadata | abusive accounts data
fblearner | model trainings are listed here.
orb | in-memory DB
# Data
|||
-|-
ig_signup_sigma_features |
ig_challenge_.... | accounts that were challenged
# Proxy metrics
|||
-|-
enrolments | tested users (UFAC)
clear | the lower the better (UFAC cleared)
# Human labeling
|||
-|-
holdout | signed-up users; after 12 days humans label them as empty, bad, or good (benign); ~5K accounts
false negative
MAU prevalence
MIMA prevalence
# Folders
- ## Misc
- `fbcode/dataswarm-pipeline/tasks/si/fake_accounts`
- `fbcode/dper3/dper3_models/si/olf`
- `www/flib/intern/scripts/sigma/clssifiers/olf`
- `configerator/source/sigma/online_classifiers/runtimes`
- all our classifiers are here
- `configerator/source/si/fake_accounts`
- defines active classifiers and the defaults
- `si_sigma/Lib/FakeAccounts`
- Sigma rules for fake accounts; notably, new_user_registration is processed here
- ## Models
- `fbcode/fblearner/flow/projects/fluent2/domains/si/aad_surfaces`
- fblearner models
- ## Sentry
- `configerator/source/si/sentry/prod/<namespace>/<category>.cconf`
- configuration of sentries.
- e.g. `namespace=facebook, category=new_user_registration` defines what is passed to the Sigma rules. This is used for FB; IG perhaps has a different config.
- `configerator/source/si/sentry/si_namespaces.thrift`
- sentry namespaces
- `www/flib/si/sentry/category/SentryCategory.php`
- existing categories
- `www/flib/si/sentry/preparable/filters/sigma/SigmaFilter.php`
- Sigma filter that can be found in sentry configurations.
- bunnylol orb
- this can be used to query sentry logs
- ## Sigma
- `si_sigma/Endpoint/Sentry/SentryFollowProfile.hs`
- `si_sigma/Contexts/Sentry/SentryFollowProfile.hs`
- scuba sigma_profiling
- to profile
- ## QE
- `configerator/qe2_diff/newExperiments/vahagnk_fast_tiger_clone.txt`
- ## Reg Attack
- `configerator/source/si/reg_attacks/surface_definitions.cinc`
- Surfaces are defined here.
- `configerator/source/si/reg_attacks/attack_definitions/`
- Attack definitions.
- `source/si/reg_attacks/reg_attacks.thrift`
- FieldTypes are here.
- ## Pipelines
- [dataswarm pipelines](https://www.internalfb.com/code/fbsource/fbcode/dataswarm-pipelines/tasks/si/fake_accounts/)
- [online_reg](https://www.internalfb.com/code/fbsource/fbcode/dataswarm-pipelines/tasks/si/fake_accounts/online_reg/)
- ## OLF
- phps OLFAdminV2 status --classifier reg_enthusiastic_impala
- [firefighting](https://www.internalfb.com/intern/wiki/OLF/Firefighting/)
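
The Sentry entry above gives the config location as a template, `configerator/source/si/sentry/prod/<namespace>/<category>.cconf`. A minimal sketch of filling in that template is shown here; the function name `sentry_config_path` is hypothetical and exists only for illustration, while the path layout and the example namespace/category come from the notes.

```shell
#!/usr/bin/env bash
# Sketch only: sentry_config_path is a hypothetical helper, not an existing tool.
# The path template itself is taken from the Sentry notes above.
sentry_config_path() {
  local namespace="$1" category="$2"
  printf 'configerator/source/si/sentry/prod/%s/%s.cconf\n' "$namespace" "$category"
}

# Example from the notes: the FB new-user-registration config.
sentry_config_path facebook new_user_registration
```
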
# Model Training workflows
- Train the model
```
flow-cli canary si.olf.ig_signup.train@olf --run-as-secure-group=team_abusive_accounts_detection --entitlement si --parameters-file configs/ig_signup_andromeda_offline.json
```
To monitor progress, use [bunnylol fblearner](https://www.internalfb.com/intern/fblearner).
- Publish model
```
aimps publish-model --manifold --oncall aad_surfaces --is-dper-model -d service_sharded <model_id>_<snapshot_id>
```
To monitor progress, use [bunnylol predictor](https://www.internalfb.com/intern/predictor).
- Register model
```
phps --www-root /var/www OLFAdminV2 training --action=register_model --classifier-name <model_name> --problem IG_FA_ANDROMEDA --surface IG_SIGNUP --model-id <model_id> --threshold 0.5
```
The model name is `ig_signup_colorful_animal`.
- [Compare how model fires.](https://fburl.com/scuba/ig_signup_sigma_features/svhenbuq)
- Create experiment to compare enrolled vs cleared. [bunnylol qe2](https://www.internalfb.com/intern/experiments)
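
The train, publish, and register steps above share the model id, so it can help to assemble them in one place. The following is a dry-run sketch that only prints the three commands in order and executes nothing. The function names and positional parameters are hypothetical; the flags are copied from the commands in this section, and the `<...>` placeholders are left unfilled on purpose.

```shell
#!/usr/bin/env bash
# Dry-run sketch: each function only PRINTS a command; nothing is executed.
# Function and parameter names are hypothetical; flags are copied from the notes.
set -euo pipefail

train_cmd() {
  # $1: parameters file
  echo "flow-cli canary si.olf.ig_signup.train@olf --run-as-secure-group=team_abusive_accounts_detection --entitlement si --parameters-file $1"
}

publish_cmd() {
  # $1: model id, $2: snapshot id
  echo "aimps publish-model --manifold --oncall aad_surfaces --is-dper-model -d service_sharded ${1}_${2}"
}

register_cmd() {
  # $1: classifier name, $2: model id, $3: threshold (defaults to 0.5 as in the notes)
  echo "phps --www-root /var/www OLFAdminV2 training --action=register_model --classifier-name $1 --problem IG_FA_ANDROMEDA --surface IG_SIGNUP --model-id $2 --threshold ${3:-0.5}"
}

# Print the three steps in workflow order with the placeholders from the notes.
train_cmd configs/ig_signup_andromeda_offline.json
publish_cmd '<model_id>' '<snapshot_id>'
register_cmd '<model_name>' '<model_id>'
```

Printing rather than running keeps the sketch safe to try anywhere; once the real ids are known, the printed lines can be reviewed and pasted one step at a time.
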
# Team identifiers
- fawg - the abusive account detection group for diffs
# Tables