A Guide To Identify Authorization Vulnerabilities At Scale Using Semgrep

Introduction

Authorization (henceforth AuthZ) vulnerabilities such as Horizontal Privilege Escalation, Vertical Privilege Escalation, Insecure Direct Object References (IDOR), and Forced Browsing are generally considered different flavors of the vulnerability class Broken Access Control, which is listed as the top category in the OWASP Top 10. These issues stem from missing server-side checks: nothing validates whether a particular subject (user, service, etc.) is authorized to perform a certain operation (create, read, update, delete, etc.) on an object (a database record, a resource such as a file inside an S3 bucket, etc.).

The inability to find AuthZ issues at scale has plagued almost every organization I've worked at so far. Understandably so: this is not a trivial vulnerability class that scanners can easily find, because scanners lack the application context and an understanding of the business logic. In my observation, the most effective way of finding these issues is manual testing, where you spend time understanding the application and its users and roles, and then come up with attack scenarios and threat models accordingly. But manual testing can only get you so far; it is just not possible to get good coverage and find these issues repeatedly and consistently at scale across all your code repositories.

The problem is exacerbated by the fact that if these issues are not identified early in the Software Development Life Cycle (SDLC), they can be really expensive to fix later on, because AuthZ issues can have a huge impact on an organization if malicious actors exploit them. Consider the cost of responding to incidents, PR and legal nightmares, developers spending time fixing bugs post production instead of doing actual development work, etc. Not to mention, if the root cause of AuthZ issues is not identified and addressed early on, the security team ends up playing whack-a-mole, fixing these issues in an ad hoc manner. The security industry, in general, struggles with hiring good security engineers, and if we are not using them effectively to eliminate entire vulnerability classes, it feels like a losing battle to me: we cannot holistically reduce the risk to an organization in a reasonable timeframe.

So, how can we make this better? Based on my experience, I have come to the conclusion that it is somewhat possible to identify and block these issues at the Pull Request (PR) stage if the security team evangelizes frameworks that have good secure-by-default libraries/methods. Depending upon the framework and its recommended security best practices, there could be certain patterns or ways of doing things that can be leveraged to proactively detect and address such issues. The way to go about it is to enable developers with these secure-by-default methods and monitor any instances where they deviate from those standards. You can monitor such deviations via static analysis rules that run on every PR and allow you to catch them well before they are deployed in production. In my view, leveraging Static Application Security Testing (SAST) tools is the best way to achieve this outcome, since you are as close as possible to the point where these bugs are introduced in the first place. It also allows you to identify these issues at scale across all the code repos in your organization without a lot of manual effort. There will always be exceptions, but in general this approach seems to work fairly well in my opinion.

In this blog, I will walk through some example scenarios and static analysis rules in the hope that they help you contextualize the idea for your environment and tech stack. Please note that this is just one way to look at this problem, and I am sure there are others, but I've had great success identifying critical AuthZ issues using this approach. For the purposes of this blog, I am going to use NestJS as the web framework and Semgrep as the SAST tool to write custom rules. NestJS has the concept of Guards, and I will be focusing on that to identify AuthZ issues. You can think of guards as decorators on every API endpoint that your application exposes. I believe most common web frameworks have something similar, so the same idea applies to those as well.

So, let's get into it.

Example Scenario

The scenario described below is a very high-level one. There are a lot of nuances as the application becomes more complex, but I will keep it simple enough to explain the general idea of identifying AuthZ issues.

You have an environment composed of microservices where teams own and build their individual services in separate repositories (i.e. it is not a monolith) and are responsible for implementing any custom AuthZ logic within their service. Consider an application that has different user personas. For example, let's say it is an application where patients can sign up to receive medical treatment and interact with a doctor for their healthcare needs. The doctors use the same application to work with different patients and provide them care. Finally, there are administrators who use the same application to provide administrative support to both patients and doctors. Every user persona is assigned a JWT post authentication, and these JWTs are enhanced with further claims such as user roles, permissions, user IDs, etc. So, it is possible to decode the JWT and determine the user type and what they can/cannot do. I won't go into the details of what constitutes a user persona; just assume that there are different roles and that it is possible to infer them from the JWT itself on the server side.

Also, in this application, certain API endpoints are supposed to be accessed only by patients, whereas others could be accessed by doctors, patients, and administrators. Patients should obviously only be able to look up their own records, but administrators can look up the records of different patients and doctors. I hope you notice where I am going with this design. Unless there is a good AuthZ strategy defined for this application, it is very much possible to introduce API endpoints that unnecessarily expose sensitive data to unauthorized actors.

So, how do we think about defining the AuthZ strategy for such an application? Well, the idea is pretty simple. We will use NestJS guards to define individual guards for the different user personas, i.e. there will be a PatientGuard, a DoctorGuard, and an AdminGuard. Each guard will have its own strategy. A strategy is basically code that validates (decodes the JWT, reads the claims consisting of roles and permissions) what a particular user persona can/cannot do. All of this would be bundled up and made available via a centralized AuthZ module package that different teams could import into their own services. They could then add these guards to the API endpoints that they plan to expose to the different user personas. It is worth mentioning that there is also an npm package called or-guard that allows guards to be used in combination if an API endpoint requires access from different user personas.
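To make the guard idea concrete, here is a minimal sketch in plain TypeScript, without NestJS, that models each guard as a predicate over claims from an already-verified JWT. The Claims shape, the role names, and the anyOf helper are assumptions made for illustration; anyOf only mimics the idea behind or-guard and is not its actual API.

```typescript
// Hypothetical shape of the claims decoded from an already-verified JWT.
interface Claims {
  userId: string;
  roles: string[];
}

// A guard is modeled here as a simple predicate over the claims.
type Guard = (claims: Claims) => boolean;

// Builds a guard that admits any user holding the given role.
const hasRole = (role: string): Guard =>
  (claims) => claims.roles.indexOf(role) !== -1;

const patientGuard: Guard = hasRole('patient');
const doctorGuard: Guard = hasRole('doctor');
const adminGuard: Guard = hasRole('admin');

// Combines guards so that passing any one of them is enough.
const anyOf = (...guards: Guard[]): Guard =>
  (claims) => guards.some((guard) => guard(claims));

// An endpoint meant for doctors and administrators, but not patients.
const doctorOrAdmin: Guard = anyOf(doctorGuard, adminGuard);
```

In a real service the guards would come from the centralized AuthZ package, so individual teams would only pick and combine them rather than write the role checks themselves.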

From a security perspective, there are some basic things that you could now implement in the form of static analysis rules that give you better visibility and coverage of exposed API endpoints across your organization and improve the security posture of your applications. Let's look into some of those below.

Example Semgrep Rules

For the sake of brevity, I will explain each rule and what it is doing by focusing on the relevant parts below. For more comprehensive rules, I have provided Semgrep playground links against each rule so that you can play with it and get the idea.

Finding endpoints with no guards

You would want to know if there are any API endpoints that are exposed and have no guards on them, i.e. they are exposed to any unauthenticated attacker on the internet. There could be valid use cases for some of these, but you would want to start by identifying all of them and then add the known exceptions, following a default-deny approach.

Rule Link - no-guards-used-on-query-mutation

rules:
  - id: no-guards-used-on-query-mutation
    patterns:
      - pattern: |
          class $CN{
            ...
            @$METHOD( ... )
            $FN(...){
              ...
            }
            ...
          }
      - pattern-not: |
          class $CN{
            ...
            @UseGuards(...)
            @$METHOD(...)
            $FN(...){
              ...
            }
            ...
          }
      - focus-metavariable: $FN
      - metavariable-pattern:
          metavariable: $METHOD
          patterns:
            - pattern-either:
                - pattern: Query
                - pattern: Mutation
      - metavariable-pattern:
          metavariable: $FN
          patterns:
            - pattern-not: stubQuery
    message: Certain mutations/queries are not guarded.
    languages:
      - ts
    severity: WARNING

In the above rule, we are looking for a pattern (in the pattern section) that matches methods defined within a class, each carrying a decorator ($METHOD). We then exclude (in the pattern-not section) those methods that have a guard declared via UseGuards, so only the unguarded ones remain. The first metavariable-pattern restricts $METHOD to Query or Mutation, and the second adds an exception for the stubQuery method, since that is okay to leave unguarded and we don't want the rule to flag it. You can test it out using the rule link provided above. This is pretty straightforward and the easiest rule you can implement to find all publicly exposed endpoints.
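The default-deny triage described for this rule can be sketched in a few lines of TypeScript. This is only an illustration of the idea, not part of the Semgrep rule: the Endpoint shape, the endpoint names other than stubQuery, and the approvedUnguarded list are all hypothetical.

```typescript
// Hypothetical inventory entry: one exposed Query/Mutation and its guards.
interface Endpoint {
  name: string;
  guards: string[];
}

// Reviewed exceptions that may stay unguarded (mirrors the stubQuery carve-out).
const approvedUnguarded: string[] = ['stubQuery'];

// Default deny: every unguarded endpoint is reported unless explicitly approved.
function findUnguarded(endpoints: Endpoint[]): string[] {
  return endpoints
    .filter((endpoint) => endpoint.guards.length === 0)
    .filter((endpoint) => approvedUnguarded.indexOf(endpoint.name) === -1)
    .map((endpoint) => endpoint.name);
}

const reported = findUnguarded([
  { name: 'stubQuery', guards: [] },
  { name: 'getPatientRecord', guards: ['PatientGuard'] },
  { name: 'listAllPatients', guards: [] },
]);
// reported contains only 'listAllPatients'
```

The point of the design is that a new unguarded endpoint is reported by default and only disappears from the report once someone consciously adds it to the exception list.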

Finding usages of bespoke (non-standard) guards

The idea behind this approach is that we want developers to use only the standard guards that are approved by the security team. It is still possible for developers to implement their own guards and strategies, but we don't want to encourage that. This rule allows us to identify such instances of bespoke guards.

Rule Link - bespoke-guard-used

rules:
  - id: bespoke-guard-used
    patterns:
      - pattern-inside: |
          export class $CLASS {
            ...
          }
      - focus-metavariable: $X
      - pattern-either:
          - pattern: |
              UseGuards(...,$X,...)
      - metavariable-pattern:
          metavariable: $X
          patterns:
            - pattern-not-regex: (Doctor|Patient|Admin)Guard
    message: Dont use bespoke guards.
    languages:
      - ts
    severity: WARNING

In the above rule, we are checking for usage of the UseGuards decorator inside a class, and we only want to flag code that uses anything besides the three standard guards: DoctorGuard, PatientGuard, and AdminGuard. This follows the secure-by-default approach, where the intention is to encourage developers to use the standard guards and not go about rolling their own. Applied consistently, this is a very powerful way to influence the engineering culture.
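One detail worth checking in your own version of this rule is whether the regex is anchored. The sketch below uses JavaScript's RegExp as a stand-in for the rule's pattern-not-regex, assuming the expression is applied as an unanchored search; the guard names AuditGuard and MyAdminGuardV2 and the anchored variant are made up for illustration.

```typescript
// Same expression as in the rule, applied as an unanchored search.
const unanchored = /(Doctor|Patient|Admin)Guard/;
// Hypothetical stricter variant that must match the whole identifier.
const anchored = /^(Doctor|Patient|Admin)Guard$/;

// A guard name is reported as bespoke when it does NOT match the allow-list regex.
const isBespoke = (guardName: string, allowList: RegExp): boolean =>
  !allowList.test(guardName);

const guardNames = ['AdminGuard', 'AuditGuard', 'MyAdminGuardV2'];

const flaggedUnanchored = guardNames.filter((name) => isBespoke(name, unanchored));
const flaggedAnchored = guardNames.filter((name) => isBespoke(name, anchored));
// flaggedUnanchored: ['AuditGuard']
// flaggedAnchored:   ['AuditGuard', 'MyAdminGuardV2']
```

Under that assumption, a bespoke guard whose name merely contains a standard guard name would slip past the unanchored expression, so anchoring the regex may be worth considering.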

Finding instances where guards are not used properly

Let's say in the admin user persona, it is possible to have multiple roles such as SuperAdmin, AccountAdmin, etc. So, whenever the AdminGuard is used, if we fail to check for the allowed roles, it would default to giving access to all the roles - which wouldn't be ideal. Therefore, we need to write a rule that would ensure that every usage of the AdminGuard also has a corresponding Role defined with it. And, if it does not, the rule should flag that particular endpoint. The rule below helps us accomplish this.

Rule Link - allowed-roles-for-admin-guard

rules:
  - id: allowed-roles-for-admin-guard
    patterns:
      - pattern-either:
          - pattern: |
              class $CN{
                ...
                @UseGuards(...,AdminGuard,...)
                @$METHOD( ... )
                $FN(...){
                  ...
                }
                ...
              }
      - pattern-not: |
          class $CN{
            ...
            @UseGuards(...,AdminGuard,...)
            @$METHOD( ... )
            @AllowedRoles(...)
            $FN(...){
              ...
            }
            ...
          }
      - focus-metavariable: $FN
      - metavariable-pattern:
          metavariable: $METHOD
          patterns:
            - pattern-either:
                - pattern: Query
                - pattern: Mutation
    message: Please define the role.
    languages:
      - ts
    severity: WARNING

In the above rule, we are looking for a pattern (in the pattern-either section) that uses the AdminGuard and then identifying those instances (in the pattern-not section) where it does not have the @AllowedRoles defined. This rule also highlights the fact that simply having a guard doesn't cut it in some cases. And, it is possible to catch those instances as well.
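To see why a missing role list matters, here is a small sketch of the fail-open behavior described above next to a fail-closed alternative. The function names and signatures are assumptions for illustration and do not reflect the actual AdminGuard implementation.

```typescript
// Admin sub-roles from the example scenario, carried in the JWT claims.
type AdminRole = 'SuperAdmin' | 'AccountAdmin';

// Fail-open variant: with no allowed roles declared, every admin role passes.
function adminAllowedFailOpen(role: AdminRole, allowed?: AdminRole[]): boolean {
  if (allowed === undefined) {
    return true;
  }
  return allowed.indexOf(role) !== -1;
}

// Fail-closed variant: a missing role list denies access.
function adminAllowedFailClosed(role: AdminRole, allowed?: AdminRole[]): boolean {
  return allowed !== undefined && allowed.indexOf(role) !== -1;
}
```

With the fail-open variant, an endpoint intended only for SuperAdmin also admits AccountAdmin whenever the role list is forgotten, which is exactly the gap the rule is meant to surface at PR time.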

Finding instances when guard strategies are not implemented correctly

It is possible that the standard guards and strategies themselves go through regressions (unknowingly or unintentionally). If something like this happens, it could have catastrophic implications across your organization, because every single dev team relies on the centralized AuthZ package (which provides the standard guards and strategies) for AuthZ capabilities within their individual microservices. Thus, it is extremely important to have some checks in place to catch such regressions as early as possible.

As an example, let's assume that the strategy of one of the guards was modified and the audience value was removed (as part of the JWT verification logic) in order to perform some testing locally, but this somehow ended up being pushed as part of a PR. If this got merged, the audience value would no longer be checked in this particular guard's strategy, which could lead to further AuthZ issues. Thus, it makes sense to detect this at the PR stage and fix it before it becomes a bigger issue across the organization.

Along the same lines, consider another example where the strategy of one of the guards was modified to skip checking the payload altogether for some local testing, i.e. it was simply turned into a passthrough, removing all security checks and controls. This could again have huge repercussions if it got deployed to production. We can come up with rules that allow us to catch such major regressions and save everybody a lot of time and money.

Rule Link - guard-strategy-audience-missing

rules:
  - id: guard-strategy-audience-missing
    patterns:
      - pattern: |
          const $OPTIONS = (...) => {
            return {
              algorithms: ['...'],
              issuer: ...,
              secretOrKeyProvider: passportJwtSecret({...}),
              ...
            };
          }
      - pattern-not: |
          const $OPTIONS = (...,$AUD:$TYPE,...) => {
            return {
              algorithms: ['...'],
              issuer: ...,
              secretOrKeyProvider: passportJwtSecret({...}),
              audience: $AUD
            };
          }
    message: Audience field is missing in the Guard Strategy.
    languages:
      - ts
    severity: WARNING

In the above rule, we are checking for functions where we define the strategy options (in the pattern section) and then identifying those instances that don't have the audience value defined inside it (in the pattern-not section).
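As a rough illustration of what is lost when the audience option is dropped, the sketch below performs the audience comparison by hand on already-decoded claims. In practice the JWT library does this when an audience is configured; the helper name and the audience values used with it are hypothetical.

```typescript
// Subset of registered JWT claims relevant here; `aud` may be a string or a list.
interface JwtClaims {
  iss?: string;
  aud?: string | string[];
}

// Returns true only when the token was issued for the expected audience.
function audienceMatches(claims: JwtClaims, expectedAudience: string): boolean {
  const aud = claims.aud;
  if (aud === undefined) {
    return false;
  }
  const audiences = typeof aud === 'string' ? [aud] : aud;
  return audiences.indexOf(expectedAudience) !== -1;
}
```

Without this comparison, a validly signed token that was minted for a different service (say, a hypothetical billing-api) would be accepted by a service that should only honor tokens issued for itself.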

Rule Link - validate-guard-strategy-payload

rules:
  - id: validate-guard-strategy-payload
    patterns:
      - pattern-inside: |
          export class $CLASS {
            ...
          }
      - pattern-inside: |
          validate(payload: any) {
            ...
          }
      - pattern-either:
          - pattern: >
              if (!payload) {
                throw new HttpException(AUTH_SAFE_ERROR_MESSAGE, HttpStatus.UNAUTHORIZED);
              }
              return payload;
    message: Auth Strategies need to validate the contents of a payload. They cannot simply detect if the payload is present or not.
    languages:
      - ts
    severity: WARNING

In the above rule, we are looking for a validate function that takes payload: any and is defined inside a class. We then check whether the function merely verifies that the payload exists and returns it as-is, without inspecting its contents. This is an example of very custom code, but I hope you get the idea. I have seen such regressions happen in the real world, and I feel such rules could be useful to keep them from happening again.
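The difference between a passthrough and a strategy that inspects the payload can be sketched as follows. This is plain TypeScript under assumed names: the PatientPayload shape and both function names are hypothetical, and a generic Error stands in for the HttpException used in the rule.

```typescript
// Hypothetical claims a patient strategy expects after signature verification.
interface PatientPayload {
  sub: string;
  role: string;
}

// Passthrough variant the rule is meant to catch: it only checks presence.
function validatePassthrough(payload: unknown): unknown {
  if (!payload) {
    throw new Error('Unauthorized');
  }
  return payload;
}

// Stricter variant: checks the contents, not just the presence.
function validatePatient(payload: unknown): PatientPayload {
  if (typeof payload !== 'object' || payload === null) {
    throw new Error('Unauthorized');
  }
  const candidate = payload as { sub?: unknown; role?: unknown };
  const sub = candidate.sub;
  const role = candidate.role;
  if (typeof sub !== 'string' || sub.length === 0) {
    throw new Error('Unauthorized');
  }
  if (role !== 'patient') {
    throw new Error('Unauthorized');
  }
  return { sub: sub, role: 'patient' };
}
```

A payload carrying a doctor role passes the first function untouched but is rejected by the second, which is the kind of difference the rule tries to surface before the change is merged.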

Finding instances when incorrect guards are imported

Let's say a dev team decides to use their own guards/strategies in their code. We did go over a rule that could help us identify those (the bespoke-guard-used rule described above). However, suppose the dev team also named their guard to coincide with one of the standard guards. In such cases, the above rule wouldn't suffice, because it only checks the guard's name, not where it is imported from. To address such edge cases, the rule below could help.

Rule Link - non-nestjs-auth-guard-imported

rules:
  - id: non-nestjs-auth-guard-imported
    patterns:
      - pattern-regex: |
          import \{ ((Doctor|Patient|Admin)Guard(,\s)?)+ \} from .*
      - pattern-not-regex: >
          import \{ ((Doctor|Patient|Admin)Guard(,\s)?)+ \} from \'\@test\/nestjs-auth\'\;
    message: It appears as if a known guard is used but not imported from the nestjs-auth module.
    languages:
      - ts
    severity: WARNING

In the above rule, we are checking whether the right guards are imported (via pattern-regex) but we are also checking whether they are imported from the correct package or not (via the pattern-not-regex). If they are not, then this rule would trigger and flag those instances.
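The interplay of the two regexes can be tried out with JavaScript's RegExp, which is close enough for a quick experiment but is not guaranteed to behave identically to Semgrep's regex engine. The import lines below are invented examples, and isSuspiciousImport is a hypothetical helper name.

```typescript
// JavaScript approximations of the rule's two regexes.
const knownGuardImport =
  /import \{ ((Doctor|Patient|Admin)Guard(,\s)?)+ \} from .*/;
const trustedGuardImport =
  /import \{ ((Doctor|Patient|Admin)Guard(,\s)?)+ \} from '@test\/nestjs-auth';/;

// Flags a line that imports a standard guard name from anywhere but the trusted package.
const isSuspiciousImport = (line: string): boolean =>
  knownGuardImport.test(line) && !trustedGuardImport.test(line);

const trustedLine = "import { AdminGuard, PatientGuard } from '@test/nestjs-auth';";
const lookalikeLine = "import { AdminGuard } from './guards/admin.guard';";
const doubleQuotedLine = 'import { AdminGuard } from "@test/nestjs-auth";';
```

In this approximation the look-alike import is flagged and the trusted one is not, but the double-quoted import from the trusted package is flagged as well. The expressions are sensitive to quote style and spacing, so they may need tuning to match your code formatting conventions.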

Finding IDOR vulnerabilities where additional checks need to be performed apart from using the standard guards

Let's say an endpoint uses the PatientGuard but this endpoint also takes in a UserID as an extra argument from the end user (user = patient in this case). Now, we all know we should be treating user input as untrusted and we should be doing our due diligence in ensuring the user input is further validated on the server side to have the right permissions and checks. Otherwise, it could be possible for PatientA to access PatientB's data by simply providing PatientB's UserID in the request. This is an example of a typical IDOR vulnerability.

In this particular case, the onus falls on the dev team to implement the additional AuthZ logic, and it's completely possible they might not even think about the IDOR attack scenario. To help the developers, the security team could write a library function (let's call it checkUserAccess) that developers can call in their code without having to worry about additional checks that might not be obvious to them. The rule below identifies endpoints that take end-user input and checks whether each such endpoint uses the checkUserAccess method to perform the additional AuthZ logic.

Rule Link - identify-patient-guarded-patientinput-authz

rules:
  - id: identify-patient-guarded-patientinput-authz
    patterns:
      - pattern-either:
          - pattern-inside: |
              class $CN{
                ...
                @UseGuards(<... PatientGuard ...>)
                @$METHOD( ... )
                $FN(..., $ARG: $TYPE, ...){
                  ...
                }
                ...
              }
      - pattern-not-inside: |
          class $CN{
            ...
            @UseGuards(<... PatientGuard ...>)
            @$METHOD( ... )
            $FN(..., $ARG: $TYPE, ...){
              ...
              this.checkUserAccess($USER, $ARG);
              ...
            }
            ...
          }
      - focus-metavariable: $FN
      - metavariable-pattern:
          metavariable: $ARG
          patterns:
            - pattern: input
    message: Please confirm that the input value that is user controllable is being checked for proper authorization.
    languages:
      - ts
    severity: WARNING

In the above rule, we are checking if the function takes in an argument named input (via the $ARG: $TYPE definition in pattern-inside and the metavariable-pattern on $ARG), and if it does, whether the function calls the checkUserAccess method (in pattern-not-inside). If it does not, the rule will flag those endpoints. This is a very powerful way to identify IDOR issues at scale, if you can socialize the correct way of performing AuthZ checks with the engineering teams.
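For reference, a helper like checkUserAccess could look roughly like the sketch below. The post does not show its implementation, so everything here is an assumption: the AuthUser shape, the decision to let administrators through, the default denial for everyone else, and the getPatientRecord handler that calls it.

```typescript
// Hypothetical authenticated user derived from verified JWT claims.
interface AuthUser {
  userId: string;
  role: 'patient' | 'doctor' | 'admin';
}

// Throws unless `user` may act on the record owned by `requestedUserId`.
function checkUserAccess(user: AuthUser, requestedUserId: string): void {
  // Administrators may look up other users' records in this scenario.
  if (user.role === 'admin') {
    return;
  }
  // Patients may only touch their own records.
  if (user.role === 'patient' && user.userId === requestedUserId) {
    return;
  }
  // Everything else is denied by default (doctor access is left out of this sketch).
  throw new Error('Forbidden');
}

// Example handler: the ID from the request is checked before any data access.
function getPatientRecord(user: AuthUser, input: { userId: string }): string {
  checkUserAccess(user, input.userId);
  return 'record-for-' + input.userId;
}
```

With this in place, PatientA supplying PatientB's ID fails at the check rather than reaching the data layer, and the Semgrep rule only needs to confirm that the call is present.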


That's it, folks! I hope this was useful and gives you an idea of the sort of checks you could write to identify some of these high-impact AuthZ issues. If you have more ideas or have solved AuthZ issues at scale in your organization, I would love to hear about it. Please feel free to reach out. Until next time, cheers!!
