In our previous exploration, “Building a Comprehensive Framework for AI Systems Security: Methodology and Grading – Part One”, we delved deep into the intricate landscape of securing Artificial Intelligence Information Systems within a Governance, Risk, and Compliance (GRC) framework. We dissected the unique challenges AI presents, highlighted reputable standards like NIST, MITRE, and ENISA, and underscored the innovative multilayer approach to robust AI cybersecurity. While understanding the foundational security aspects is crucial, a pressing new question arises: How can we truly gauge the trustworthiness of an AI system? In this follow-up piece, we’ll embark on a journey to demystify the criteria and processes behind evaluating an AI system’s reliability and integrity, ensuring that we not only secure our systems but also place our confidence in their outputs.

Characteristics of a Trustworthy AI System

The NIST AI Risk Management Framework states that “Characteristics of trustworthy AI systems include: valid and reliable, safe, secure and resilient, accountable and transparent, explainable and interpretable, privacy-enhanced, and fair with harmful bias managed.” The publication continues: “While all characteristics are socio-technical system attributes, accountability and transparency also relate to the processes and activities internal to an AI system and its external setting. Neglecting these characteristics can increase the probability and magnitude of negative consequences.”

Let’s now review how NIST looks at each of these:

Validation is the “confirmation, through the provision of objective evidence, that the requirements for a specific intended use or application have been fulfilled” (Source: ISO 9000:2015). Deployment of AI systems which are inaccurate, unreliable, or poorly generalized to data and settings beyond their training creates and increases negative AI risks and reduces trustworthiness.

Potential Measurements:

  1. Accuracy: Measure the percentage of correct predictions the AI model makes out of all its predictions.
  2. Precision and Recall: Precision measures the proportion of the model’s positive predictions that are actually correct, while recall (or sensitivity) measures the proportion of actual positives the model correctly identifies (both are computed in the sketch after this list).
  3. F1 Score: The harmonic mean of precision and recall, providing a balance between the two when the class distribution is imbalanced.
  4. Area Under the Curve (AUC) and Receiver Operating Characteristic (ROC): The ROC curve plots the true positive rate against the false positive rate across classification thresholds, and the AUC summarizes it as a single number for binary classification problems.
  5. Mean Absolute Error (MAE) and Mean Squared Error (MSE): For regression models, these metrics gauge the average magnitude of errors between predicted and actual values.
  6. Generalization Error: Measure how well the AI model performs on unseen data, potentially using techniques like cross-validation.
  7. Robustness: Test the model against adversarial attacks or slightly modified input data to see if it can still make accurate predictions.
  8. Bias and Fairness Metrics: Evaluate the AI model for any unintentional biases for different groups, ensuring that it provides equitable predictions across various demographics.
  9. Confidence Intervals: For each prediction, the model can provide a confidence interval, giving a range in which the actual value is likely to fall.
  10. Consistency: Check if the model provides consistent outputs when given slightly varied but essentially similar inputs over repeated tests.
  11. Stress and Load Tests: Simulate high-load situations or edge cases to check how the AI system performs under pressure or in unexpected scenarios.
  12. Feedback Loops: Implement mechanisms to collect user feedback on AI predictions, providing real-world validation.
  13. Version Control and Change Logs: Keep track of all updates and modifications to the AI model, ensuring that each version is validated before deployment.
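
To make a few of these concrete, below is a minimal sketch of how the accuracy, precision, recall, F1, and AUC measurements might be computed with scikit-learn. The names `model`, `X_test`, and `y_test` are hypothetical placeholders for a fitted binary classifier and its held-out evaluation data; this is an illustration under those assumptions, not a prescribed implementation.

```python
# Minimal sketch: validity metrics for a binary classifier.
# Assumes scikit-learn is installed; `model`, `X_test`, and `y_test` are
# placeholders for a fitted estimator and held-out evaluation data.
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    roc_auc_score,
)

y_pred = model.predict(X_test)               # hard class predictions
y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1 score :", f1_score(y_test, y_pred))
print("ROC AUC  :", roc_auc_score(y_test, y_score))
```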

Reliability is defined as the “ability of an item to perform as required, without failure, for a given time interval, under given conditions” (Source: ISO/IEC TS 5723:2022). Reliability is a goal for the overall correctness of AI system operation under the conditions of expected use and over a given period of time, including the entire lifetime of the system.

Potential Measurements:

  1. Uptime/Downtime Metrics: Record the amount of time the AI system is operational versus the time it’s down due to failures.
  2. Failure Rate: Measure the frequency of failures during a specific period. For instance, if an AI system fails five times in a month, its failure rate is 5/month.
  3. Mean Time Between Failures (MTBF): The average time between system failures; a higher MTBF indicates greater reliability (computed, along with MTTR and availability, in the sketch after this list).
  4. Mean Time To Recovery (MTTR): Calculate the average time it takes to get the AI system back up and running after a failure.
  5. Survival Analysis: Statistical methods, such as the Kaplan-Meier estimator, can be used to estimate the probability that the system operates without failure beyond a given time.
  6. Redundancy Checks: Implementing and monitoring redundant systems can ensure that if one system fails, another can take over, ensuring continuous operation.
  7. Stress Testing: Simulate extreme conditions beyond normal operational capacity to see how and when the system fails.
  8. Performance Degradation: Monitor for any slow degradation in performance over time which might indicate potential future failures.
  9. Environment Tolerance: Test the system under various environmental conditions it might encounter in real-world scenarios to ensure it remains reliable.
  10. Error Rates: Measure how often the system makes errors or false predictions during its operation.
  11. Repeatability and Reproducibility (R&R): Gauge if the system can consistently reproduce the same result under the same conditions over time.
  12. Rollback and Recovery Tests: Examine the system’s ability to revert to a previous state and recover data after a failure.
  13. Operational Profiles: Create profiles for expected operational conditions and monitor system performance against these to ensure it operates reliably under expected scenarios.
  14. Real-time Monitoring and Logging: Continuously monitor the system’s operations and maintain logs to quickly identify and rectify any reliability issues.
  15. Feedback Loops: Gather user feedback on any system reliability issues encountered during real-world operation.
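
As a rough illustration of the uptime, failure-rate, MTBF, and MTTR measurements above, here is a minimal sketch in plain Python. The incident log and 30-day observation window are invented placeholder values, used only to show how the arithmetic fits together.

```python
# Minimal sketch: reliability metrics from a hypothetical incident log.
# Each record holds the time of failure and the repair duration in hours.
incidents = [
    {"failed_at_hour": 120.0, "repair_hours": 1.5},
    {"failed_at_hour": 410.0, "repair_hours": 0.5},
    {"failed_at_hour": 695.0, "repair_hours": 3.0},
]
observation_hours = 720.0  # length of the monitoring window (30 days)

downtime = sum(i["repair_hours"] for i in incidents)
uptime = observation_hours - downtime
failures = len(incidents)

mtbf = uptime / failures                      # Mean Time Between Failures
mttr = downtime / failures                    # Mean Time To Recovery
availability = uptime / observation_hours     # uptime vs. downtime
failure_rate = failures / observation_hours   # failures per operating hour

print(f"MTBF: {mtbf:.1f} h, MTTR: {mttr:.1f} h, "
      f"availability: {availability:.3%}, failure rate: {failure_rate:.4f}/h")
```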

Accuracy is defined by ISO/IEC TS 5723:2022 as “closeness of results of observations, computations, or estimates to the true values or the values accepted as being true.” Measures of accuracy should consider computational-centric measures (e.g., false positive and false negative rates), human-AI teaming, and demonstrate external validity (generalizable beyond the training conditions). Accuracy measurements should always be paired with clearly defined and realistic test sets – that are representative of conditions of expected use – and details about test methodology; these should be included in associated documentation. Accuracy measurements may include disaggregation of results for different data segments.

Potential Measurements:

  1. Precision and Recall: Precision measures how many of the model’s positive predictions are actually correct, while recall measures how many of the actual positive cases the model correctly identifies.
  2. False Positive Rate (FPR) and False Negative Rate (FNR): These metrics provide insight into the number of incorrect positive results and the number of incorrect negative results, respectively.
  3. Confusion Matrix: A table that is often used to describe the performance of a classification model, showcasing true positives, false positives, true negatives, and false negatives.
  4. F1 Score: The harmonic mean of precision and recall, giving a balance between the two when there’s an uneven class distribution.
  5. Area Under the Receiver Operating Characteristic Curve (AUC-ROC): This metric provides an aggregate measure of performance across all possible classification thresholds.
  6. Root Mean Square Error (RMSE): Useful for regression problems, it tells us how far, on average, predictions are from actual values.
  7. Cohen’s Kappa: Measures the agreement between two raters (or between model predictions and reference labels) while correcting for agreement expected by chance.
  8. Brier Score: Measures the accuracy of probabilistic predictions.
  9. Human-AI Teaming Evaluation: This involves assessing how well AI systems and humans work together, focusing on outcomes when both are involved in decision-making.
  10. Cross-Validation: Uses partitioned data to test the model’s performance on unseen data, ensuring it hasn’t overfit to its training set.
  11. External Validation: Test the model using data from different sources or environments than it was trained on to ensure generalizability.
  12. Bias Analysis: Check for accuracy disparities across different groups, segments, or categories within the data to ensure fairness and avoid potential biases (see the disaggregation sketch after this list).
  13. Data Drift Monitoring: Over time, the distribution of data can change. Continuous monitoring can ensure the model’s accuracy doesn’t degrade as data evolves.
  14. Ensemble Methods: Use multiple algorithms or models and aggregate their predictions to improve accuracy.
  15. Bootstrapping: A resampling technique that can be used to estimate the distribution of a statistic (like model accuracy) by sampling with replacement from the data.
  16. Sensitivity Analysis: Evaluate how different values of an independent variable impact a particular dependent variable under a given set of assumptions.
  17. Diversity Assessment of Test Sets: Ensure the test datasets are diverse and representative of all potential scenarios in which the AI will operate.
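
To illustrate the idea of disaggregating results for bias analysis, below is a minimal sketch that computes accuracy separately for each data segment. It assumes pandas is available; the segment labels, true values, and predictions are purely illustrative placeholders.

```python
# Minimal sketch: disaggregating accuracy across data segments to surface
# accuracy disparities. The DataFrame contents are illustrative placeholders.
import pandas as pd

results = pd.DataFrame({
    "segment": ["A", "A", "A", "B", "B", "B", "B", "C", "C"],
    "y_true":  [1, 0, 1, 1, 1, 0, 0, 1, 0],
    "y_pred":  [1, 0, 0, 1, 1, 0, 1, 0, 0],
})

per_segment = (
    results.assign(correct=results["y_true"] == results["y_pred"])
           .groupby("segment")["correct"]
           .mean()
)
print(per_segment)                                        # accuracy within each segment
print("Max disparity:", per_segment.max() - per_segment.min())
```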

Robustness or generalizability is defined as the “ability of a system to maintain its level of performance under a variety of circumstances” (Source: ISO/IEC TS 5723:2022). Robustness is a goal for appropriate system functionality in a broad set of conditions and circumstances, including uses of AI systems not initially anticipated. Robustness requires not only that the system perform exactly as it does under expected uses, but also that it should perform in ways that minimize potential harms to people if it is operating in an unexpected setting.

Potential Measurements:

  1. Out-of-Sample Testing: Test the system’s performance on datasets that were not part of the training data. This helps to gauge how the system performs on entirely new data.
  2. Cross-Validation: Split the data multiple times into training and test sets, which can help in understanding the variance in model performance and its generalizability across different data subsets.
  3. Transfer Learning Evaluation: Determine how well a pre-trained model can be adapted to a new, but related, task. A robust model should adapt efficiently with limited new data.
  4. Adversarial Testing: Intentionally input misleading or corrupted data to test the system’s resilience. This can expose vulnerabilities or areas where the system is overly sensitive.
  5. Noise Injection: Add random noise or variations to input data and measure the system’s stability in producing consistent outputs (illustrated in the sketch after this list).
  6. Stress Testing: Push the system to its limits by increasing the volume or speed of data inputs, or by tweaking other conditions to be more extreme than usual.
  7. Variation in Operational Conditions: Test the system under different environmental, hardware, or software conditions to see how adaptable it is.
  8. Domain Adaptation Measures: Assess how well a model trained in one domain (or on one dataset) performs in a different but related domain.
  9. Fallback Mechanisms: Monitor how the system behaves when it encounters data or situations it deems uncertain or outside its training. It should have mechanisms to handle or flag such scenarios gracefully.
  10. Bias and Fairness Testing: Ensure that the system remains unbiased and fair across various data groups, especially in scenarios it wasn’t specifically trained for.
  11. Versioning and Monitoring: Keep track of different versions of the AI model, and monitor their performance over time in real-world scenarios. If newer versions or settings lead to decreased robustness, there’s a reference to revert or improve upon.
  12. Scenario-based Evaluation: Develop potential future scenarios or use cases that the system wasn’t initially designed for, and test its performance.
  13. Abstraction and Reasoning Tests: Especially for more complex systems, test the AI’s ability to generalize from specific training examples to broader concepts or categories.
  14. Feedback Loops: Implement mechanisms where the system can learn from its mistakes in real-world deployments, helping to increase its robustness over time.
  15. Sensitivity Analysis: Determine how small changes in the input can impact the output, providing insights into the stability of the AI system.
  16. Comparison with Benchmark Models: Compare the system’s performance with other established models or benchmarks to determine its relative robustness.
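
As one way to approach noise injection, here is a minimal sketch that perturbs inputs with small Gaussian noise and measures how often predictions change. It assumes NumPy, numeric input features, and a fitted classifier; `model` and `X_test` are hypothetical placeholders, and the noise scale would need to be tuned to your data.

```python
# Minimal sketch: noise injection to probe robustness. Assumes NumPy, numeric
# features, and a fitted classifier `model`; `X_test` is a placeholder for
# held-out inputs.
import numpy as np

rng = np.random.default_rng(0)
baseline = model.predict(X_test)

agreement = []
for _ in range(10):
    noise = rng.normal(loc=0.0, scale=0.05, size=X_test.shape)  # small Gaussian perturbation
    perturbed = model.predict(X_test + noise)
    agreement.append(np.mean(perturbed == baseline))            # fraction of unchanged predictions

print(f"Mean prediction stability under noise: {np.mean(agreement):.3f}")
```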

Safe. AI systems should “not under defined conditions, lead to a state in which human life, health, property, or the environment is endangered” (Source: ISO/IEC TS 5723:2022). Safe operation of AI systems is improved through:

  • responsible design, development, and deployment practices;
  • clear information to deployers on responsible use of the system;
  • responsible decision-making by deployers and end users; and
  • explanations and documentation of risks based on empirical evidence of incidents.

Potential Measurements:

  1. Hazard Analysis and Risk Assessment (HARA): Conduct a comprehensive analysis to identify potential hazards and assess the risk associated with each hazard in the context of the AI system’s deployment.
  2. Safety Metrics: Develop quantifiable metrics to measure safety, for example the number of system failures per million operational hours, or the frequency of unexpected behaviors (a simple rate calculation is sketched after this list).
  3. Safety Testing Protocols: Create standardized testing procedures that specifically target potential safety concerns. This could involve testing edge cases or rare scenarios.
  4. Incident Reporting: Implement a system where any safety-related incidents or near-misses are documented, along with details about the circumstances and outcomes.
  5. Safety Certification: Achieve certifications from relevant industry or regulatory bodies that set safety standards for AI systems.
  6. User Feedback Collection: Collect feedback from deployers and end users about the system’s safety, especially any concerns or incidents they might have encountered.
  7. Safety Audits: Regularly conduct internal or third-party safety audits to review practices, incidents, and overall system behavior.
  8. Documentation Review: Ensure that safety-related documentation is comprehensive, clear, and regularly updated. This includes user manuals, risk assessments, and incident reports.
  9. Operational Boundaries: Clearly define and document the conditions or environments under which the AI system is considered safe to operate.
  10. Safety Drills: Simulate scenarios where the AI system might pose risks, and practice response measures to mitigate those risks.
  11. Red Team Exercises: Engage external experts to challenge and test the system’s safety measures and identify vulnerabilities.
  12. Monitoring and Alerts: Set up real-time monitoring for the AI system with alerts for any safety-related anomalies or deviations from expected behavior.
  13. Safety Training: Provide training for deployers and end users on the safe operation of the AI system, emphasizing responsible decision-making.
  14. Transparent Safety Protocols: Clearly document and communicate the safety protocols and best practices associated with the AI system.
  15. Ethical Review Boards: Engage with ethical boards or committees to review and assess the safety implications and ethical considerations of the AI system.
  16. Regular Updates and Patches: Ensure that the AI system is regularly updated to address any identified safety concerns or vulnerabilities.
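
For the quantifiable safety metric mentioned above, failures per million operational hours, here is a minimal arithmetic sketch; the incident count and operating hours are invented placeholders.

```python
# Minimal sketch: a quantifiable safety metric, failures per million
# operational hours. The counts below are illustrative placeholders.
safety_incidents = 3          # safety-related failures observed
operational_hours = 42_000    # total hours the system has operated

failures_per_million_hours = safety_incidents / operational_hours * 1_000_000
print(f"{failures_per_million_hours:.1f} failures per million operational hours")
```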

Secure and Resilient. AI systems, as well as the ecosystems in which they are deployed, may be said to be resilient if they can withstand unexpected adverse events or unexpected changes in their environment or use – or if they can maintain their functions and structure in the face of internal and external change and degrade safely and gracefully when this is necessary (Adapted from: ISO/IEC TS 5723:2022). Common security concerns relate to adversarial examples, data poisoning, and the exfiltration of models, training data, or other intellectual property through AI system endpoints. AI systems that can maintain confidentiality, integrity, and availability through protection mechanisms that prevent unauthorized access and use may be said to be secure.

Potential Measurements:

  1. Stress Testing: Subject the AI system to extreme conditions or high loads to determine how it reacts and recovers.
  2. Adversarial Testing: Regularly conduct tests using adversarial examples to assess how the AI model reacts and to refine its defenses.
  3. Data Integrity Checks: Validate the integrity of data inputs and outputs frequently to identify potential data poisoning or tampering (a hashing-based check is sketched after this list).
  4. Incident Response Time: Measure the time taken to detect, respond to, and recover from security incidents.
  5. Backup and Recovery Protocols: Implement and periodically test backup and recovery mechanisms to ensure the AI system can be quickly restored after an adverse event.
  6. Penetration Testing: Conduct regular internal and external penetration tests to identify vulnerabilities in the AI system and its surrounding ecosystem.
  7. Change Management Logs: Maintain logs of all changes, updates, and patches applied to the AI system. Review logs to detect any unauthorized changes.
  8. Authentication and Authorization Measures: Implement robust multi-factor authentication and authorization protocols, and measure their effectiveness in preventing unauthorized access.
  9. Availability Metrics: Monitor and report on the uptime and availability of the AI system, ensuring it meets predefined benchmarks.
  10. Anomaly Detection: Implement systems that monitor for and alert on any unusual behavior, suggesting potential security threats or system failures.
  11. Redundancy Checks: Ensure there are multiple redundant systems or components in place and measure how effectively they take over during system failures.
  12. Endpoint Security Assessment: Regularly assess the security of endpoints interacting with the AI system to prevent data exfiltration or other threats.
  13. Encryption Strength: Ensure data at rest and in transit is encrypted using strong cryptographic techniques, and periodically assess the strength and effectiveness of these encryption methods.
  14. Model Confidentiality Measures: Implement techniques such as differential privacy to maintain model confidentiality and measure the efficacy of such techniques in preventing information leaks.
  15. System Update Frequency: Track how often the AI system and its components are updated to address security vulnerabilities.
  16. External Audit and Certification: Obtain security certifications from relevant industry or regulatory bodies, and engage external experts for periodic security audits.
  17. User Access Reviews: Periodically review user access rights and roles, ensuring only authorized personnel have appropriate access to the AI system.
  18. Training and Awareness: Measure the effectiveness of security training provided to staff and users and its impact on the system’s security posture.
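
As a small illustration of a data integrity check, below is a sketch that hashes data files and compares the digests against previously recorded values. The file paths and expected digests are hypothetical placeholders; in practice the reference digests would be recorded when the data is first approved.

```python
# Minimal sketch: verify training-data files against previously recorded
# SHA-256 digests. Paths and expected digests are illustrative placeholders.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = {
    "data/train.csv": "<expected sha-256 digest>",
    "data/labels.csv": "<expected sha-256 digest>",
}

def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

for path, expected in EXPECTED_SHA256.items():
    status = "OK" if sha256_of(path) == expected else "CHANGED OR TAMPERED"
    print(f"{path}: {status}")
```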

Accountable and Transparent. Trustworthy AI depends upon accountability. Accountability presupposes transparency. Transparency reflects the extent to which information about an AI system and its outputs is available to individuals interacting with such a system – regardless of whether they are even aware that they are doing so. Meaningful transparency provides access to appropriate levels of information based on the stage of the AI lifecycle and tailored to the role or knowledge of AI actors or individuals interacting with or using the AI system.

Potential Measurements:

  1. Documentation Quality: Ensure comprehensive documentation is available, detailing the design, development, training, testing, and deployment of the AI system.
  2. Decision Explanation: Implement and measure the effectiveness of methods that provide human-understandable explanations for AI decisions, such as decision trees, saliency maps, or feature-importance rankings (see the sketch after this list).
  3. Audit Trails: Maintain comprehensive logs of all actions taken by the AI system, and measure how accessible and understandable these logs are for third-party audits.
  4. Model Interpretability Metrics: Evaluate the level of model interpretability, aiming for models that, while sophisticated, still allow for human understanding.
  5. Transparency Scorecards: Develop scorecards that grade AI systems based on predefined transparency criteria, like the level of detail in their decision explanations.
  6. User Feedback: Conduct regular surveys or feedback sessions with users to gauge their understanding of and trust in the AI system.
  7. Access Control Measures: Ensure that there are clear protocols for who can access the AI system’s inner workings, its training data, and its output data.
  8. Public Disclosure Metrics: Regularly publish statistics, metrics, or insights related to the AI system’s performance, bias detection, and other relevant factors, making them accessible to the public.
  9. Algorithmic Fairness Checks: Employ tools and techniques to detect and report biases in AI decisions, ensuring that the system operates fairly.
  10. Data Source Transparency: Clearly document and disclose the sources of data used to train and refine the AI model.
  11. Stakeholder Engagement Metrics: Engage with various stakeholders throughout the AI system lifecycle and measure the frequency and quality of these engagements.
  12. Role-based Information Access: Ensure that information is shared at the appropriate level based on roles, making sure sensitive data is protected while also allowing for accountability and transparency.
  13. Regulatory Compliance Reporting: Track and publicly disclose the AI system’s compliance with applicable regulations and standards related to transparency and accountability.
  14. Redress Mechanisms: Implement and publicize mechanisms for users or affected parties to challenge or appeal AI decisions, and track the number and outcomes of these challenges.
  15. Communication Clarity Index: Evaluate the clarity and understandability of any communication about the AI system to its users, especially in terms of its purpose, potential risks, and benefits.
  16. Continuous Review Cycles: Implement regular review cycles to assess and improve transparency and accountability mechanisms based on evolving standards, feedback, and technological advancements.
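
As one possible approach to decision explanation, here is a minimal sketch that uses permutation feature importance from scikit-learn to produce a human-readable ranking of which inputs most influence a model’s performance. `model`, `X_test`, `y_test`, and `feature_names` are hypothetical placeholders; this is one illustrative technique among many, not the only way to generate explanations.

```python
# Minimal sketch: rank features by permutation importance as a simple,
# human-readable explanation aid. Assumes scikit-learn and a fitted `model`;
# `X_test`, `y_test`, and `feature_names` are placeholders.
from sklearn.inspection import permutation_importance

result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)

# Rank features by how much shuffling them degrades performance.
ranked = sorted(zip(feature_names, result.importances_mean),
                key=lambda pair: pair[1], reverse=True)
for name, importance in ranked[:5]:
    print(f"{name}: mean importance {importance:.4f}")
```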

Challenges Measuring Trustworthiness of an AI System

As you can already imagine, simply looking at the characteristics and providing potential measurements can be an arduous task. This makes it difficult to apply a clear set of criteria generally across different organizations. It is also interesting because we are moving towards a different model of governance, risk, and compliance, one that includes societal influences and the potential impacts of the AI system. This point is compelling because it is a gray area in which we would typically rely on an authoritative source to define what needs to be done and how it is measured, correlated with some risk that would then need to be acted on. For GRC professionals who will support the initial review of the trustworthiness of these systems, we can turn to concepts we are already familiar with:

  1. Documentation
  2. Continuous Monitoring
  3. Remediation
  4. Continuous Improvement

Wrapping Up

Navigating the realm of trustworthiness in AI systems is uncharted territory for many organizations. The intertwining of traditional governance, risk, and compliance (GRC) principles with new-age AI challenges necessitates a fresh perspective and approach. The factors we’ve highlighted—such as documentation, continuous monitoring, remediation, and continuous improvement—serve as starting points. However, as we delve deeper into each factor, the complexity and nuances become increasingly apparent. To truly unlock a comprehensive understanding of this domain and to equip GRC professionals with the tools and knowledge they need, our subsequent article will dissect each of these concepts in detail. Join us as we embark on this journey to unravel the intricacies of measuring AI system trustworthiness, ensuring that the technology we leverage is not only advanced but also ethically and responsibly aligned with our societal values.
