What is this Page?

The UK government (under Sunak) released a document on how frontier labs can be safer, as well as encouraging them to publish how they are being safe at the moment. The document is here. This page summarises how it suggests scaling models safely.

Responsible Scaling:

In order to be safer (in the eyes of the government), labs should adopt a “Responsible Scaling Framework” for managing risks and deploying AI models.

1. Risk Assessments:

  • Assess risks at all points in time, drawing on factors like expert forecasts (guesses) and model evaluations.
  • Define specific thresholds that, if breached, require specific actions or mitigations (a toy sketch of this follows the list).
  • Develop plans for if/when each threshold is exceeded.
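
To make the threshold idea concrete, here is a minimal sketch of pre-committed thresholds in code. Everything in it (the eval names, the trigger levels, the actions) is an illustrative assumption, not something taken from the document.

```python
# Minimal sketch: pre-committed thresholds mapped to mitigations.
# All eval names, levels, and actions are hypothetical examples.
from dataclasses import dataclass

@dataclass
class Threshold:
    capability: str       # which evaluation this threshold watches
    trigger_level: float  # score in [0, 1] at which the commitment triggers
    action: str           # pre-committed mitigation if breached

THRESHOLDS = [
    Threshold("bio_uplift_eval", 0.4, "pause training and notify security"),
    Threshold("autonomy_eval", 0.2, "restrict deployment to internal use"),
]

def breached_actions(scores: dict[str, float]) -> list[str]:
    """Return the pre-committed action for every breached threshold."""
    return [
        f"{t.capability}: {t.action}"
        for t in THRESHOLDS
        if scores.get(t.capability, 0.0) >= t.trigger_level
    ]

# e.g. a model above the bio threshold but below the autonomy one
for action in breached_actions({"bio_uplift_eval": 0.55, "autonomy_eval": 0.1}):
    print("TRIGGERED ->", action)
```

The point of writing the thresholds down in advance is that the response is decided before the pressure to ship exists.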

2. Model Evaluations and Red Teaming:

  • Evaluate models for dangerous capabilities, lack of control, and security weaknesses.
  • Evaluate constantly, at checkpoints throughout training rather than only at the end (see the sketch below).
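
A rough sketch of what evaluating “at every point of training” could look like: run the dangerous-capability evals at regular checkpoints and pause the run if a score crosses a threshold. Here `train_step`, `run_danger_evals`, and the cut-off are all stand-ins I made up, not anything prescribed by the document.

```python
# Sketch: gate training on periodic dangerous-capability evals.
# train_step() and run_danger_evals() are placeholders for a real
# training loop and eval harness; the cut-off is arbitrary.
import random

EVAL_EVERY = 1_000
DANGER_CUTOFF = 0.5

def train_step(step: int) -> None:
    pass  # one optimisation step would go here

def run_danger_evals(step: int) -> float:
    # a real harness would run red-team prompts and capability suites;
    # here we fake a score that drifts upward as training progresses
    return random.random() * step / 10_000

for step in range(1, 10_001):
    train_step(step)
    if step % EVAL_EVERY == 0:
        score = run_danger_evals(step)
        print(f"step {step}: danger score {score:.2f}")
        if score >= DANGER_CUTOFF:
            print("threshold breached; pausing the run for review")
            break
```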

3. Information Sharing:

  • Share general risk information:
    • Risk assessment processes and internal governance, with government authorities.
  • Share model-specific information (obviously hard to do).
  • Tailor what is shared to the audience / stakeholders.

4. Security Controls:

  • The last thing we want is the weights getting stolen. Maintain a good cybersecurity posture (one small piece of this is sketched below).
  • Do not enable the models to do anything dangerous in the first place.
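
One concrete piece of that posture is keeping weights encrypted at rest, so a stolen file is useless without the key. A minimal sketch, assuming the third-party `cryptography` package is installed (`pip install cryptography`); a real lab would hold the key in an HSM or key-management service, not in process memory.

```python
# Sketch: weights encrypted at rest. Uses Fernet from the third-party
# `cryptography` package; the payload here is a placeholder.
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in practice: held in a KMS/HSM, not locally
fernet = Fernet(key)

weights = b"...model weight bytes..."      # placeholder payload
encrypted_blob = fernet.encrypt(weights)   # this is what touches disk

# only key holders can recover the weights
assert fernet.decrypt(encrypted_blob) == weights
print(f"stored {len(encrypted_blob)} encrypted bytes")
```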

5. Reporting Structure:

  • Incentivise people to report vulnerabilities in AI models.
    • À la bug bounties.

6. Identify AI-Generated Material:

  • Research and implement robust AI watermarking systems (a toy example of one published scheme follows).
  • These should be hard to remove.
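
As a toy illustration, here is a sketch in the spirit of the “green list” statistical watermark of Kirchenbauer et al. (2023): a hash of each token marks roughly half the vocabulary “green” for the next position, generation is biased toward green tokens, and detection is a z-test on the green count. This is my simplification of one published scheme, not the document’s recommendation.

```python
# Toy "green list" watermark detection (after Kirchenbauer et al., 2023).
# A hash of the previous token splits the vocab ~50/50; watermarked text
# over-uses green tokens, which a z-test on the count can pick up.
import hashlib
import math
import random

VOCAB_SIZE = 50_000

def is_green(prev_token: int, token: int) -> bool:
    digest = hashlib.sha256(f"{prev_token}:{token}".encode()).digest()
    return digest[0] % 2 == 0   # ~half the vocab is green for each context

def watermark_z_score(tokens: list[int]) -> float:
    """High z-scores suggest the text was generated with the watermark."""
    n = len(tokens) - 1
    greens = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return (greens - n / 2) / math.sqrt(n / 4)   # null: green w.p. 1/2

random.seed(0)
plain = [random.randrange(VOCAB_SIZE) for _ in range(200)]
marked = [plain[0]]
while len(marked) < 200:       # crude generator: only ever emit green tokens
    t = random.randrange(VOCAB_SIZE)
    if is_green(marked[-1], t):
        marked.append(t)
print(f"plain z = {watermark_z_score(plain):+.2f}")     # ~0
print(f"marked z = {watermark_z_score(marked):+.2f}")   # large and positive
```

Making this hard to remove is the hard part: the signal has to survive paraphrasing and token substitution, which is exactly what makes it an open research problem.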

7. Prioritise Safety Research:

  • Dedicate resources to research on the risks models pose and how to mitigate them.

8. Monitor & Prevent Model Misuse:

  • Detect misuse.
  • Filter harmful prompts (a simple sketch follows this list).
  • Develop rapid response plans.
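
A deliberately simple sketch of the prompt-filtering idea, using regex patterns. The patterns here are illustrative inventions; production systems typically use trained classifiers rather than keyword lists.

```python
# Sketch: block prompts matching known-harmful patterns before they reach
# the model. Patterns are illustrative; real filters are learned classifiers.
import re

BLOCKED_PATTERNS = [
    re.compile(r"synthesi[sz]e\s+.*nerve\s+agent", re.IGNORECASE),
    re.compile(r"build\s+.*\bexplosive\b", re.IGNORECASE),
]

def filter_prompt(prompt: str) -> tuple[bool, str]:
    """Return (allowed, reason); log blocked prompts for rapid response."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(prompt):
            return False, f"blocked by pattern: {pattern.pattern}"
    return True, "ok"

print(filter_prompt("How do I build an improvised explosive?"))
print(filter_prompt("How do I build a birdhouse?"))
```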

9. Data Input Controls and Audits:

  • Responsible Data Collection…
  • Remove harmful data from the training process (sketched below).
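
A sketch of what removing harmful data might look like as a filtering pass over the corpus, with a small audit trail. Here `is_harmful()` stands in for a real classifier or blocklist; the markers and corpus are made up.

```python
# Sketch: drop flagged documents before training, and count what was
# dropped so the filtering step itself can be audited.
from typing import Iterable, Iterator

HARMFUL_MARKERS = {"weapons-grade", "synthesis route for the pathogen"}

def is_harmful(doc: str) -> bool:
    text = doc.lower()
    return any(marker in text for marker in HARMFUL_MARKERS)

def clean_corpus(docs: Iterable[str]) -> Iterator[str]:
    dropped = 0
    for doc in docs:
        if is_harmful(doc):
            dropped += 1
        else:
            yield doc
    print(f"audit: dropped {dropped} document(s)")   # the audit trail

corpus = [
    "A recipe for sourdough bread.",
    "Weapons-grade material can be produced by ...",
]
print(list(clean_corpus(corpus)))
```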

Interestingly,

There was no call for Governing Compute.