A Data Scientist is responsible for mining data sources to extract subsets of data from which useful information can be derived using statistical or machine learning techniques.
As a Data Scientist, I want to be able to apply mathematical techniques efficiently to large volumes of data. I need to be able to access large stores of structured or unstructured data, and to clean, validate, and separate that data into versioned training and testing data sets.
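One possible way to obtain versioned, reproducible training and testing sets is to derive each record's split from a hash of its identifier and a version label. The sketch below is illustrative only: the function names (`assign_split`, `split_dataset`, `dataset_fingerprint`), the `(record_id, payload)` record shape, and the 20% test fraction are assumptions, not something prescribed by this document.

```python
import hashlib


def assign_split(record_id: str, version: str, test_fraction: float = 0.2) -> str:
    """Deterministically assign a record to 'train' or 'test'.

    The same (record_id, version) pair always lands in the same split,
    so a split can be reproduced later from the version label alone.
    """
    digest = hashlib.sha256(f"{version}:{record_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"


def split_dataset(records, version: str, test_fraction: float = 0.2):
    """Split an iterable of (record_id, payload) pairs into (train, test) lists."""
    train, test = [], []
    for record_id, payload in records:
        if assign_split(record_id, version, test_fraction) == "test":
            test.append((record_id, payload))
        else:
            train.append((record_id, payload))
    return train, test


def dataset_fingerprint(records) -> str:
    """Return a content hash that identifies this exact set of records.

    Records are sorted by id first, so the fingerprint does not depend on
    the order in which they were read. Assumes record ids are unique.
    """
    hasher = hashlib.sha256()
    for record_id, payload in sorted(records, key=lambda r: r[0]):
        hasher.update(repr((record_id, payload)).encode("utf-8"))
    return hasher.hexdigest()
```

Storing the version label together with the fingerprints of the resulting training and testing sets would let a later run confirm that it is using exactly the same data.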
I need to be able to inspect this data for inherent bias and process it to maintain privacy. I need to be able to train models or run regressions.
As part of the training process, I need to be able to test the performance of my models against a target threshold. I would also like to evaluate the resulting model against our corporate values to ensure fairness and freedom from bias.
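A check of this kind could be sketched as a simple gate that compares overall accuracy against a target threshold and also looks at the accuracy gap between groups. The metric (plain accuracy), the group-gap measure, the function names, and the default thresholds are all assumptions made for illustration; a per-group accuracy gap is only one simplistic proxy for fairness and would not on its own demonstrate freedom from bias.

```python
from collections import defaultdict


def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_by_group(y_true, y_pred, groups) -> dict:
    """Accuracy computed separately for each value in `groups`."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: accuracy(ts, ps) for g, (ts, ps) in buckets.items()}


def passes_gate(y_true, y_pred, groups, min_accuracy=0.9, max_group_gap=0.05):
    """Return (passed, report) for a hypothetical release gate.

    The model passes only if overall accuracy meets `min_accuracy` AND the
    spread between the best and worst group accuracy is within `max_group_gap`.
    """
    overall = accuracy(y_true, y_pred)
    per_group = accuracy_by_group(y_true, y_pred, groups)
    gap = max(per_group.values()) - min(per_group.values())
    report = {
        "overall_accuracy": overall,
        "per_group_accuracy": per_group,
        "group_gap": gap,
    }
    return overall >= min_accuracy and gap <= max_group_gap, report
```

In a delivery pipeline, the boolean result could decide whether a trained model is promoted, while the report is kept as evidence of the evaluation.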
I must be able to show regulators an audit trail of my training activities to support compliance.
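As a rough illustration of what such an audit trail might look like, the sketch below keeps an append-only log in which every entry includes the hash of the previous one, so later tampering can be detected. The entry layout, the function names, and the event fields used in the usage example are hypothetical; a real compliance regime may require something quite different.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _hash_body(body: dict) -> str:
    """Hash an entry body using a stable JSON encoding."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append a training event to the log, chained to the previous entry."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS_HASH
    body = {"index": len(log), "prev_hash": prev_hash, "event": event}
    entry = {**body, "entry_hash": _hash_body(body)}
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Check that no entry was altered, removed, or reordered."""
    prev_hash = GENESIS_HASH
    for position, entry in enumerate(log):
        body = {
            "index": entry["index"],
            "prev_hash": entry["prev_hash"],
            "event": entry["event"],
        }
        if (
            entry["index"] != position
            or entry["prev_hash"] != prev_hash
            or entry["entry_hash"] != _hash_body(body)
        ):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

For example, a training run might record `append_entry(log, {"action": "train", "data_version": "v1"})` followed by an evaluation event, and `verify_log(log)` could be run before the log is handed to an auditor.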
Value Add from Continuous Delivery
- Reduced lead times in delivering new capabilities
- Increased deployment frequency
- Reduced risk to the organization
- Path to regulatory compliance
- Standard ways of working