# Configuration File for LM-Debugger++

The default LM-Debugger++ configuration file looks like this:

```
server_files_dir: "lm-debugger/server_files/",
model_name: "hf_organization/model_name",
device: "cuda:0",
server_ip: "127.0.0.1",
server_port: 8000,
elastic_ip: "127.0.0.1",
elastic_port: 9200,
react_ip: "127.0.0.1",
react_port: 3000,
streamlit_ip: "127.0.0.1",
streamlit_port: 8501,
top_k_tokens_for_ui: 10,
top_k_for_elastic: 50,
num_layers: 32,
elastic_index: "model_projections_docs",
elastic_projections_path: $.server_files_dir + "my_file.pkl",
elastic_api_key: "VGhlIGNha2UgaXMgYSBsaWU=",
layer_mappings: {
    mlp_sublayer: "model.layers.{}.mlp",
    attn_sublayer: "model.layers.{}.attn",
    mlp_activations: "model.layers.{}.mlp.act_fn",
    mlp_gate_proj: "model.layers.{}.mlp.gate_proj",
    mlp_up_proj: "model.layers.{}.mlp.up_proj",
    mlp_down_proj: "model.layers.{}.mlp.down_proj",
    decoder_input_layernorm: "model.layers.{}.input_layernorm",
    decoder_post_attention_layernorm: "model.layers.{}.post_att_norm",
    post_decoder_norm: "model.norm"
},
sae_paths: [
    "autoencoders/my_sae.pt"
],
autoencoder_device: "cuda:1",
sae_active_coeff: 100,
rome_paths: [
    "lm-debugger/config_files/ROME/codellama.json"
]
```

Explanation of the data fields:

* `server_files_dir`: Directory where the server files are stored. This includes the forward projections of the MLP (`LMDebuggerIntervention`).
* `model_name`: Hugging Face name of the model. The model must be supported by the instance of `TransformerModelWrapper` in use.
* `device`: Device the transformer is loaded onto (e.g., `"cuda"`, `"cuda:0"`, `"cpu"`).
* `server_ip`: Flask backend server IP.
* `server_port`: Flask backend server port.
* `elastic_ip`: ElasticSearch server IP.
* `elastic_port`: ElasticSearch server port.
* `react_ip`: React frontend IP.
* `react_port`: React frontend port.
* `streamlit_ip`: Streamlit frontend IP.
* `streamlit_port`: Streamlit frontend port.
* `top_k_tokens_for_ui`: Number of tokens shown in the React frontend (as the most probable tokens before and after interventions).
* `top_k_for_elastic`: Number of tokens returned by each ElasticSearch request (see the query sketch below).
* `num_layers`: Number of layers of the transformer.
* `elastic_index`: Name of the ElasticSearch index the MLP projections are stored in.
* `elastic_projections_path`: Path to the pickle file containing the MLP projections. This file is generated by the script `lm_debugger/es_client/create_offline_files.py` and uploaded to the ElasticSearch instance by `lm_debugger/es_client/index_value_projections_docs.py`. The `$.server_files_dir` prefix is resolved when the configuration file is evaluated (see the loading sketch below).
* `elastic_api_key`: API key of the ElasticSearch instance.
* `layer_mappings`: Mappings of layers to their descriptors, used in interventions; `{}` is replaced by the layer index (see the resolution sketch below).
* `sae_paths`: List of pickle files of the SAE models used in the application. SAEs can be trained using the scripts included in `sparse_autoencoders`; please refer to the documentation page Training-Scripts SAE.
* `autoencoder_device`: Device used for SAE calculations (e.g., `"cuda"`, `"cuda:0"`, `"cpu"`).
* `sae_active_coeff`: Coefficient that an active SAE intervention's feature is set to (see the intervention sketch below).
* `rome_paths`: List of JSON files, one for each ROME instance.
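The `$.server_files_dir + "my_file.pkl"` self-reference suggests Jsonnet syntax, which the original LM-Debugger uses for its config files (as Jsonnet, the snippet above would be wrapped in an outer `{ ... }` object). Under that assumption, here is a minimal loading sketch using the `jsonnet` Python bindings; the config path is a placeholder:

```python
import json

import _jsonnet  # pip install jsonnet

CONFIG_PATH = "config_files/my_model.jsonnet"  # hypothetical path

# Jsonnet resolves self-references such as
#   elastic_projections_path: $.server_files_dir + "my_file.pkl"
# during evaluation, so the resulting dict already holds the
# concatenated path "lm-debugger/server_files/my_file.pkl".
config = json.loads(_jsonnet.evaluate_file(CONFIG_PATH))

print(config["model_name"])
print(config["elastic_projections_path"])
```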
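To illustrate how `elastic_ip`, `elastic_port`, `elastic_index`, `top_k_for_elastic`, and `elastic_api_key` fit together, here is a sketch using the official `elasticsearch` Python client. The `match` field name `tokens` is an assumption for illustration only; the actual document schema is defined by `lm_debugger/es_client/index_value_projections_docs.py`:

```python
from elasticsearch import Elasticsearch

# Connection values taken from the config above.
es = Elasticsearch(
    "http://127.0.0.1:9200",
    api_key="VGhlIGNha2UgaXMgYSBsaWU=",
)

# Hypothetical keyword search over the indexed projection documents.
response = es.search(
    index="model_projections_docs",
    size=50,  # top_k_for_elastic
    query={"match": {"tokens": "paris"}},
)

for hit in response["hits"]["hits"]:
    print(hit["_id"], hit["_score"])
```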
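The `layer_mappings` templates can be resolved to concrete modules for hooking by formatting in the layer index and looking the result up on the model. A sketch assuming a Llama-style Hugging Face model; the model name is the placeholder from the config, and `resolve` is a hypothetical helper, not part of the LM-Debugger++ API:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("hf_organization/model_name")

def resolve(model: torch.nn.Module, template: str, layer_idx: int) -> torch.nn.Module:
    """Fill the {} placeholder with the layer index and look up the module."""
    return model.get_submodule(template.format(layer_idx))

# "model.layers.{}.mlp" -> "model.layers.5.mlp"
mlp_5 = resolve(model, "model.layers.{}.mlp", 5)

# An intervention can then attach a forward hook to the resolved module.
handle = mlp_5.register_forward_hook(lambda module, inputs, output: output)
handle.remove()
```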
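Finally, a conceptual sketch of what `sae_active_coeff` means in practice: an active SAE intervention clamps one feature of the encoded hidden state to that coefficient before decoding. The `encode`/`decode` methods are hypothetical; the actual SAE class is defined by the training scripts in `sparse_autoencoders`:

```python
import torch

# Load a trained SAE from one of the sae_paths entries
# onto the configured autoencoder_device.
sae = torch.load("autoencoders/my_sae.pt", map_location="cuda:1", weights_only=False)

def apply_sae_intervention(hidden: torch.Tensor, feature_idx: int, coeff: float = 100.0) -> torch.Tensor:
    """Encode the hidden state, force one feature to sae_active_coeff, decode."""
    features = sae.encode(hidden)       # hypothetical method
    features[..., feature_idx] = coeff  # coeff == sae_active_coeff
    return sae.decode(features)         # hypothetical method
```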