Since ChatGPT came onto the scene in late 2022, test after test has proven vulnerable to the wiles of generative AI. The initial GPT-3.5 model was impressive enough, and the more advanced GPT-4 has shown even greater proficiency at test-taking. Name a large, well-known test, and ChatGPT has probably passed it. In addition to bar exams, SATs, and AP exams, ChatGPT has passed 9 of 12 AWS certification exams and Google’s L3 engineer coding interview.
At HackerRank, we’ve seen firsthand how AI can bypass MOSS code similarity, the industry standard for coding plagiarism detection.
All of these sudden vulnerabilities can seem scary for those administering tests. How can you trust the answers you’re getting? And if your tests rely heavily on multiple-choice questions, which are uniquely vulnerable to large language models, how can you revise test content to be more AI-resistant?
These developments are worrying for test-takers, as well. If you’re taking a test in good faith, how can you be sure you’re getting a fair shake? Interviewing is stressful enough without having to wonder if other candidates are seeking an AI-powered advantage. Developers deserve the peace of mind that they’re getting a fair shot to showcase their skills.
At HackerRank, we’ve done extensive testing to understand how AI can disrupt assessments, and we’ve found that AI’s performance is intrinsically linked with question complexity. It handles simple questions easily and efficiently, finds questions of medium difficulty challenging, and struggles with complex problems. This pattern parallels most candidates’ performance.
However, creating increasingly intricate questions to outwit AI isn’t a sustainable solution. Sure, it’s appealing at first, but it’s ultimately counterproductive: dialing up complexity degrades the assessment experience and can compromise the value of the assessment itself.
Instead, our focus remains on upholding the integrity of the assessment process, and thereby ensuring that every candidate’s skills are evaluated fairly and reliably.
Upholding integrity means being realistic—and transparent. This means acknowledging that there are assessment questions that AI can solve. And it means alerting you when that is the case, so you can make informed decisions about the content of your assessments.
That is why we are introducing an AI solvability indicator.
This indicator operates on a combination of two criteria.
If a question is not solvable by AI, it does not get flagged. Likewise, if a question is solvable, but the answer triggers our plagiarism detection model, it does not get flagged. The question may be solvable, but plagiarism detection ensures that the integrity of the assessment is protected.
If a question is solvable by AI and the solution evades plagiarism detection, it will get flagged as AI Solvable: Yes. Generally, these questions are simple enough that the answers don’t generate enough signals for plagiarism detection to be fully effective.
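The two criteria above combine into a simple decision rule: a question is flagged only when AI can solve it and the AI-generated answer slips past plagiarism detection. Here is a minimal sketch of that logic; the function and parameter names are illustrative, not HackerRank’s actual implementation.

```python
def is_flagged_ai_solvable(ai_can_solve: bool, evades_plagiarism_detection: bool) -> bool:
    """Hypothetical sketch of the flagging rule: a question is marked
    "AI Solvable: Yes" only when both criteria hold."""
    return ai_can_solve and evades_plagiarism_detection


# Not solvable by AI -> not flagged
assert is_flagged_ai_solvable(False, False) is False

# Solvable, but the answer triggers plagiarism detection -> not flagged,
# because plagiarism detection protects the assessment's integrity
assert is_flagged_ai_solvable(True, False) is False

# Solvable and the answer evades detection -> flagged
assert is_flagged_ai_solvable(True, True) is True
```

In other words, plagiarism detection acts as a second gate: only questions that fail both defenses surface to you as AI solvable.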
Questions flagged as AI solvable will be removed from certified assessments, but may still appear in custom assessments, particularly if those assessments have not been updated in some time.
If you’re browsing through questions, you can also choose to hide all AI-solvable questions, just as you can hide all leaked questions.
Beyond the transparency of the AI solvability indicator, we are building in measures to actively ensure assessment integrity. These include:
No matter where your company stands on AI, we believe it’s best to be transparent about its capabilities. Yes, AI can solve simpler technical assessment questions. We prefer you to know that so that you can take informed actions.
So what can you do? Every company is coming at AI in their own way, so there’s no one right answer. What works for one organization may not work for another. But broadly speaking, here are some steps you should consider to protect the integrity of your assessments.
Ensuring assessment integrity in a time of rapidly advancing AI can seem difficult. You can only dial up question complexity so far before it starts to degrade the assessment experience and even compromise the value of assessments in finding qualified talent. That’s why we’re focused on reinforcing key pillars of assessment integrity, including our industry-leading AI-powered plagiarism detection, certified assessments, and solvability indicators that give you the transparency and signals you need to make the best decisions about your assessments.
Be sure to check out our plagiarism detection page for more detail on how HackerRank is ensuring assessment integrity.