Ever since AI models started rendering biased results, causing a great deal of dissatisfaction, panic, chaos, and insecurity, "Explainability" has become the buzzword.
Indeed, it is a genuine, must-have quality for an AI-based product. The user has the right to ask "Why?" and "How?". But how many such queries are enough to establish an "Explainability" score? In other words, how much of the model's response to such queries is enough to cross the "Confidentiality" threshold?
For an ordinary user, a satisfactory response may be enough of an explanation. But it is not enough for a curious user, or perhaps for someone who wants to learn a trade secret. To what extent can a response be deemed explainable while still not exposing critical information? Can the model sense the ulterior motives of the person asking the queries?
In an earlier post, I mentioned how queries can be restricted, for example by charging for each extra question or by limiting the number of query attempts. But are these settings chosen after investigating the problem from a hacker's perspective? Hackers vary widely in skill: one hacker may get everything he or she needs with 5 queries, while another may get there with just 2 selective, strategic queries.
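To make the "pay per extra question / limit the attempts" idea concrete, here is a minimal, purely illustrative sketch in Python. The class name QueryQuota, the free allowance of 5 queries, and the surcharge amount are my own assumptions, not something defined in the earlier post.

```python
# Hypothetical sketch of a per-user query quota with a pay-per-extra-question
# surcharge. All names and numbers here are illustrative assumptions.
from collections import defaultdict

class QueryQuota:
    def __init__(self, max_free_queries=5, price_per_extra=0.10):
        self.max_free_queries = max_free_queries
        self.price_per_extra = price_per_extra
        self.counts = defaultdict(int)  # number of queries asked, per user id

    def check(self, user_id):
        """Return (allowed, surcharge) for this user's next query."""
        self.counts[user_id] += 1
        if self.counts[user_id] <= self.max_free_queries:
            return True, 0.0
        # beyond the free allowance, every extra question is charged
        return True, self.price_per_extra

quota = QueryQuota()
for i in range(7):
    allowed, surcharge = quota.check("user-42")
    print(f"query {i + 1}: allowed={allowed}, surcharge=${surcharge:.2f}")
```

Notice that this counts queries per account, which is exactly the kind of control a strategic attacker can sidestep with a few well-chosen questions, or with multiple accounts, as discussed further below.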
Furthermore, if models are trained to refrain from responding to certain queries, or beyond a certain number of queries, how is explainability affected? For example, a recommender system may play music exactly to a user's taste, yet in some scenarios the app plays something completely against the user's choice. In such a scenario, how can the recommender system explain the result? If the explanation implicitly reveals feature-specific information that was supposed to be part of the proprietary algorithm, is that explainability or a breach of confidentiality?
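One way to picture the trade-off: the recommender could expose only a coarse, top-k list of contributing feature names while withholding the proprietary scores behind them. The sketch below is hypothetical; the feature names, the explain() helper, and the weights are my assumptions, not any real product's API.

```python
# Hypothetical sketch: expose only coarse feature names in an explanation,
# never the proprietary weights that produced the recommendation.
PROPRIETARY_WEIGHTS = {            # assumed internal scoring, kept secret
    "listening_history": 0.62,
    "time_of_day": 0.21,
    "trending_in_region": 0.12,
    "promotional_placement": 0.05,
}

def explain(top_k=2):
    """Return only the names of the top-k contributing features,
    without the weights or the full feature list."""
    ranked = sorted(PROPRIETARY_WEIGHTS, key=PROPRIETARY_WEIGHTS.get, reverse=True)
    return ranked[:top_k]

print("This track was recommended mainly because of:", explain())
```

Even this truncated explanation leaks something: the very names of the features hint at how the algorithm works, which is exactly the conundrum raised above.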
On the other hand, if a hacker queries an AI-based app and the app blocks the response to a particular query, isn't that itself a cue that the answer lies in the blocked query? The hacker can then frame strategic queries from multiple accounts to piece together the hidden answers behind every blocked query. Doesn't too much confidentiality here end up being too explainable?
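If refusals themselves become a signal, one possible hedge (again a hypothetical sketch, not a proven defence) is to monitor blocked queries globally rather than per account, flagging when near-duplicate blocked questions arrive from many different users. The crude normalisation and the alert threshold below are arbitrary assumptions.

```python
# Hypothetical sketch: flag when the same blocked question keeps arriving
# from many different accounts, a possible sign of coordinated probing.
from collections import defaultdict

refused = defaultdict(set)  # normalised blocked query -> set of user ids

def record_refusal(user_id, query, alert_threshold=3):
    key = " ".join(query.lower().split())  # crude normalisation of the query text
    refused[key].add(user_id)
    if len(refused[key]) >= alert_threshold:
        print(f"ALERT: blocked query seen from {len(refused[key])} accounts: {key!r}")

record_refusal("acct-1", "Which features weight the playlist ranking?")
record_refusal("acct-2", "  which  FEATURES weight the playlist ranking?")
record_refusal("acct-3", "Which features weight the playlist ranking?")
```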
The conundrum of reaching an optimum point where the system is explainable yet not exposing is yet to be solved with strategic methods.
Can you think of any?