Hello everyone and welcome to the first in a series of blog posts aimed at providing some of the core knowledge to get you started with vulnerability research.

As a warning, this first post is going to be a bit long-winded and will be mostly walls of text, but in general I’m hoping to avoid this for future posts.

First off, I’d like to talk a little about myself and my background. Not because I particularly like doing so, but because it gives some context for why I’ll write a lot about certain topics, how I write about them, and why I might write less about others. If you’re not a fan of blog posts that get into this sort of thing (recipe posts, I’m calling you out), feel free to skip ahead.

The only time I will write primarily about myself

I’ve had the tremendous privilege of being a lifelong tinkerer with computers. I was exposed to them at a very young age, starting with an Atari 1200XL, and moved into the PC era working my way through DOS, Windows 3.1, 95, etc. My father “introduced” me to Linux around the age of 13 by giving me an old second-hand computer, a giant book on Linux that came with a Mandrake Linux CD, and broad orders to turn the computer into a router so we could use our new high-speed connection on more than one computer. (Wireless didn’t really exist yet, and home routers cost quite a bit more than this old computer.)

Between being given a computer that was essentially mine to tinker with and break without repercussion, and my constant desire to get my hands on as many computer games as possible (thanks, ADHD!), I developed a strong affinity for tinkering with software and diving into the seedy underbelly of the relatively young Internet (Piracy is bad, kids!). However, I found programming to be boring and sort of pointless, and as a result I generally avoided getting into any Computer Science-y type career until I decided to finally get my undergraduate degree in Information Systems Security at the tender age of 27.

Other than really knowing my way around a Linux shell I genuinely had little to no experience with programming, and no understanding of the InfoSec industry beyond what I understood as “ethical hacking”, essentially a simplistic view of penetration testing. As I progressed through my education, I quickly began to realize I was fascinated by how software functions (and fails) more than anything else. I spent excessive amounts of time on any assignments involving vulnerabilities, exploitation, or reverse engineering.

Fast forward a bit over 5 years, and here I am tearing software apart and analyzing “known” vulnerabilities for a living, something I thought wouldn’t happen until I had at least 10 years of experience in something like pentesting.

So why did I just talk about all this? Well, I want to make it clear that I am, at the end of the day, still rather new to this field. Due to the nature of my work and past experience, I am also not deeply specialized in any particular area of vulnerability research. However, this allows me to cover (at least) introductory information on a wide range of topics. And because I learned most of this stuff quite recently, I will try to talk about my specific approaches to the learning process as much as possible.

What these posts are for

At this point you may be wondering if you should keep reading because it’s not terribly clear what sort of things I’ll be writing about.

In general I plan to cover:

Core knowledge on a wide range of vulnerability research topics
The basics of many different types of vulnerabilities
Common tools including how to use them, handy tips, and when to use them
Knowledge that isn’t necessarily specific to vulnerability research but is broadly applicable in many research situations
Approaches to problem solving in the context of specific types of vulnerabilities (this will probably come later)
Some stuff specific to N-day research

And on the flip side, for the most part I will not talk about:

Hardcore binary exploitation stuff like bypassing mitigations or chaining vulnerabilities
Heavy details of 0day research - I’ll do some introductory stuff and try to provide some good references for further learning
Stuff that is primarily hardware based - this is definitely not in my area of expertise

Outside of this introductory post, the goal is to provide lots of hands-on practice. As a result, you'll get the most out of these posts if you have some basic experience with programming and networking concepts, and a little bit of assembly knowledge can't hurt, but I'll try to avoid assuming specific knowledge and will provide references and sources to learn more whenever possible.

The world of vulnerability research

When I talk about vulnerability research in the general sense, I'm referring to basically any activity that involves finding or understanding bugs in software and hardware that potentially have an impact on the "security" of the research target. A security impact doesn't necessarily mean some sort of code execution; it can be any bug that affects the confidentiality, integrity, or availability of the target. This may sound familiar if you've seen the Common Vulnerability Scoring System (CVSS) or have heard of the "CIA triad" in a general security context.

I tend to split the field of vulnerability research into two main subcategories: 0-day (or 0day?) research and N-day research.

0-day research

Looking for new individual vulnerabilities that belong to a known "class" of vulnerabilities in a specific piece of software
- There are a lot of ways to accomplish this, but it can involve tasks such as "fuzzing" or reading code to manually find bugs
Discovering entirely new types of vulnerabilities
- Obviously this is rare, but some of this sort of stuff comes from academia.
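
To make the "fuzzing" idea a bit more concrete, here is a minimal sketch of a mutation fuzzer. Everything in it is made up for illustration: parse_record is a hypothetical toy parser with a planted bug, and the list of "interesting" byte values is just a common starting point, not a rule. Real fuzzers are far smarter about generating inputs, and against native code a "crash" would be an actual memory error rather than a Python exception.

```python
def parse_record(data: bytes) -> bytes:
    """Hypothetical toy format: one length byte followed by the payload."""
    length = data[0]
    if length == 0:
        return b""
    # Planted bug: the length field is trusted without comparing it to len(data).
    _last_byte = data[length]  # raises IndexError when length >= len(data)
    return data[1:1 + length]


INTERESTING = (0x00, 0x7F, 0x80, 0xFF)  # boundary values that often expose bugs


def fuzz(seed: bytes) -> list:
    """Replace each byte of the seed with each boundary value and record crashes."""
    crashes = []
    for position in range(len(seed)):
        for value in INTERESTING:
            mutated = bytearray(seed)
            mutated[position] = value
            try:
                parse_record(bytes(mutated))
            except Exception:
                crashes.append(bytes(mutated))
    return crashes


seed = bytes([3]) + b"abc"  # a valid record: length 3, payload "abc"
crashes = fuzz(seed)
for crash in crashes:
    print("crashing input:", crash.hex())
```

Only mutations of the length byte make the parser fall over here, which is already a useful hint about where the bug lives.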

N-day research

Analyzing "known" vulnerabilities
- Vulnerabilities that we know exist even if we have zero details beyond "a vulnerability exists".
Using the analysis for a few primary purposes
- Writing exploits - (pentesters, this is a GOLD MINE for you)
- Formulating some sort of detection strategy for exploitation attempts
- Discovering root cause and coming up with patch/mitigation strategies

There are certainly other applications of N-day research such as coming up with vulnerability prevention strategies in a development context but I'll primarily focus on what I've mentioned above.

While there is quite a bit of overlap in the general knowledge required for both 0-day and N-day research, the types of activities you will spend the majority of your time on will tend to be quite different between the two.

Vulnerability categorization as a means of directing research

This is not meant to be an exhaustive list whatsoever. It's meant as a starting point to familiarize you with some terminology in case you want to go search for more information.

I personally like to categorize vulnerabilities in ways that help decide the tools and techniques needed to facilitate research, as opposed to immediately focusing on the specific type of vulnerability. In 0-day research you may or may not be targeting a specific type of vulnerability, and for N-day research you may not know the type until you've spent a lot of time investigating, so focusing on the specific vulnerability type isn't always feasible or helpful right away. It's also important to note that some vulnerabilities may fall into multiple categories that I outline below due to their complex nature.

First, let's look at the types of code you're likely to encounter.

Vulnerabilities by "code type"

Machine code

Instructions in "machine language" that are specific to a particular CPU architecture such as x86, x64 or ARM.
Often this code was compiled from a higher-level language such as C or C++
Analysis generally involves reading "Assembly language", a way of representing machine language using mnemonics instead of having to memorize the binary values of instructions
Technically compiled open source applications could fall into this category, but I prefer to put them in the same category as "scripting" languages
- You will likely analyze the application using source code as opposed to looking at the machine code

Byte code

Instructions in an "intermediate language" that rely on execution in a "virtual machine" to convert the intermediate language to CPU specific instructions
Java and the .NET languages (such as C#) are the most common examples of this approach
Quite often you can decompile these to obtain near-original source code - but there are some limitations to this
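
If you've never looked at byte code before, Python is an easy place to peek at some. This is an analogy and not Java or .NET: CPython also compiles source to byte code for its own virtual machine, and the standard dis module prints it with mnemonics. The add function below is just a placeholder, and the exact instruction names vary between Python versions.

```python
import dis


def add(a, b):
    return a + b


# The raw byte code the CPython virtual machine executes for add().
print(add.__code__.co_code.hex())

# The same byte code with mnemonics, much like reading a disassembly.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```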

Scripts or open source

You have all the original code and can run and debug it as you see fit
Lots of web-based stuff falls into this category - PHP, Ruby, Python, etc.
Generally the "easiest" type of target to work with - but not always :)

Next, I like to categorize based on the attack vector. This is somewhat similar to the "Attack Vector" metric in CVSS but my categorizations are again based on how it affects the approach to research and the tools I'm likely to need.

Also, these are how I personally like to categorize stuff. You may or may not find this useful, and you may want to come up with different categories for your own purposes.

Vulnerabilities by attack vector

File-based

I generally reserve this category for vulnerabilities involving the structure of a given file type
Files can be binary or text-based (and sometimes both!)
Analysis may involve reading through specification documents for the file format or deriving the structure for an undocumented format
I tend not to include vulnerabilities involving running a crafted program (like for a privilege escalation vulnerability) as those are more about the actions taken by the program as opposed to the structure
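
As a rough sketch of what "the structure of a file type" means in practice, here is a parser for a completely made-up binary format: a 4-byte magic value ("TOY1"), a 2-byte version, and a 4-byte payload length, followed by the payload. The format, the field layout, and the function name are all assumptions for illustration. The check on the declared length is exactly the kind of thing that tends to be missing in file-based vulnerabilities.

```python
import struct

# Hypothetical header: magic, version, payload length (little-endian, no padding).
HEADER = struct.Struct("<4sHI")


def parse_toy_file(blob: bytes) -> bytes:
    """Return the payload of a TOY1 file, rejecting inconsistent headers."""
    if len(blob) < HEADER.size:
        raise ValueError("file too short for header")
    magic, version, payload_len = HEADER.unpack_from(blob, 0)
    if magic != b"TOY1":
        raise ValueError("bad magic")
    payload = blob[HEADER.size:HEADER.size + payload_len]
    # The length field is attacker-controlled, so compare it to what is really there.
    if len(payload) != payload_len:
        raise ValueError("declared length exceeds file size")
    return payload


good_file = HEADER.pack(b"TOY1", 1, 5) + b"hello"
lying_file = HEADER.pack(b"TOY1", 1, 500) + b"hello"  # claims 500 bytes, has 5

print(parse_toy_file(good_file))
try:
    parse_toy_file(lying_file)
except ValueError as error:
    print("rejected:", error)
```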

Web-based

Vulnerabilities in a web application of some sort
- As in, applications you interact with primarily using a web browser.
The vulnerabilities are the result of something that the web application handles incorrectly.
This may also include a vulnerability in the server the application is running on, if the vulnerability affects the security of the web application and not necessarily the security of the server software itself (confusing, I know.)

Network protocol based

Vulnerabilities as a result of sending specific data via some network protocol
Could be a vulnerability in the handling of the protocol "messages" or the specific data contained within the "messages"
Analysis may again involve reading through spec documents for the protocol or trying to figure out an undocumented protocol

There are certainly vulnerabilities that don't neatly fall into these categories. For example, I don't really know where to place web browser vulnerabilities, although in theory you could call it file-based, or is it web-based? Who knows? ¯\_(ツ)_/¯

Finally let's look at a “quick” rundown of the more common specific vulnerability types, lumped into a category when possible. There’s a lot here but if you’re quite familiar with vulnerabilities in general, you can probably skip this :)

Common vulnerability types

It's important to note that some types of vulnerabilities may neatly fit exclusively into one of the code type or attack vector categories above and I will mention when that's the case. However, this is not to say they can't ever appear in other situations.

Memory corruption - generally found in binary applications

Buffer overflows - A memory buffer has space to store a certain number of bytes and it's possible to write data outside of that buffer, corrupting other structures stored in adjacent memory
Out-of-bounds read - similar to the buffer overflow, but instead of writing data where we shouldn't, we're reading data we shouldn't.
Null pointer dereference - An attempt to read or write memory through a pointer whose value is "0" (null), which typically crashes the program if not handled properly. (Denial of service is definitely a "security" issue in many cases!)
Use-after-free - An allocated buffer was "freed", but a reference to a memory location in the buffer gets used after the fact. This means the memory location could contain something entirely different from what's expected.
Uninitialized pointer - The program expects a certain memory location to contain the address of another memory location, but the location was never set and could point to a memory location controlled by an attacker.
Integer overflow (and underflow) - The result of an arithmetic operation doesn’t fit in the integer’s type, so the value “wraps around” and becomes an unexpectedly small (or large) number
- If you have a 1 byte integer with a value of 0xFF and you add 2 to it, it becomes 0x01.
- This isn’t really “memory corruption” but usually this leads to some other sort of vulnerability like a buffer overflow - in the above example if that value was used to allocate a buffer that was supposed to be 0xFF+2 in size, it instead allocates a buffer with a size of 0x01.
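
The wraparound in the example above is easy to reproduce. This is only a simulation: Python integers don't overflow, so ctypes.c_uint8 stands in for a 1-byte integer, and a bytearray stands in for process memory. The "SECRETS" bytes and the 4-byte write are made up to show what ends up next to (and on top of) an undersized buffer.

```python
import ctypes

intended_size = 0xFF + 2                            # 0x101, what the code meant to allocate
stored_size = ctypes.c_uint8(intended_size).value   # truncated to a single byte
print(hex(intended_size), "->", hex(stored_size))

# Simulated flat memory: the "buffer" is the first stored_size bytes,
# everything after it belongs to some other data structure.
memory = bytearray(stored_size) + bytearray(b"SECRETS")
incoming = b"A" * 4                   # data longer than the 1-byte buffer
memory[0:len(incoming)] = incoming    # unchecked copy, like memcpy with no bounds check
print(bytes(memory))
```

The first print shows 0x101 turning into 0x1, and the second shows the start of the neighbouring data overwritten with "A" bytes.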

Web “only”

Cross-Site Scripting (XSS) - When an attacker is able to get arbitrary JavaScript to run in the browser of a user of the web application
- Reflected XSS is when a user is tricked into sending a crafted request to the web application (e.g. by clicking a malicious link) and the web application renders some malicious JavaScript in the response as a result of the request
- Stored XSS is when the server stores the “bad” value that causes the malicious JavaScript to be rendered each time a user visits the affected page
- The purpose of the bad JavaScript may be something like stealing session cookies or re-writing portions of the web page
Cross-Site Request Forgery - Similar to reflected XSS, an attacker gets a user of a vulnerable web application to send a crafted request. But this time the attacker just wants the user to interact with the application in a way the attacker chooses. Basically, they’re trying to get a user to do something they probably don’t want to do.
HTTP Request Smuggling - This one is a bit tricky to explain in a concise way so I’ll let the wonderful people at PortSwigger handle this one: https://portswigger.net/web-security/request-smuggling
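
Going back to XSS for a moment, the core mistake is small enough to sketch in a few lines. The two render functions below are hypothetical stand-ins for whatever templating a real application uses; the only real API here is html.escape from the Python standard library. Escaping on output is one common defence, not the only one.

```python
import html


def render_unsafe(name: str) -> str:
    # User input is pasted straight into the page.
    return f"<p>Hello, {name}!</p>"


def render_escaped(name: str) -> str:
    # html.escape turns <, >, & and quotes into HTML entities.
    return f"<p>Hello, {html.escape(name)}!</p>"


payload = "<script>alert(1)</script>"
print(render_unsafe(payload))   # the browser would run the script
print(render_escaped(payload))  # the browser would display it as text
```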

Denial of service

Memory exhaustion - An attacker can cause the application to use so much memory that the application and/or operating system either crash or become unresponsive
Infinite loop - Some condition causes the application to enter an infinite loop, which can cause excessive CPU usage or make the application otherwise unavailable to handle requests
Algorithmic complexity - Those of you who took an algorithms class in school will know this one. Basically the way the application processes data is very inefficient with certain input.
- I like to think of this as sort of an “amplification” attack - send a small crafted request that causes the same load as many many more requests.
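
Here is a small sketch of the algorithmic complexity idea, using a toy hash table I made up for the purpose. Its hash function (the sum of the key's bytes) is deliberately weak, so any reordering of the same characters lands in the same bucket. Instead of measuring time, it counts comparisons, which keeps the result repeatable.

```python
from itertools import permutations


class ToyHashTable:
    """Hypothetical hash set with a deliberately weak hash function."""

    def __init__(self, bucket_count: int = 64):
        self.buckets = [[] for _ in range(bucket_count)]
        self.comparisons = 0  # work counter, stands in for CPU time

    def _bucket_for(self, key: str) -> list:
        # Weak hash: the sum of the bytes, so anagrams always collide.
        return self.buckets[sum(key.encode()) % len(self.buckets)]

    def insert(self, key: str) -> None:
        bucket = self._bucket_for(key)
        for existing in bucket:  # linear scan of the bucket
            self.comparisons += 1
            if existing == key:
                return
        bucket.append(key)


def comparisons_for(keys) -> int:
    table = ToyHashTable()
    for key in keys:
        table.insert(key)
    return table.comparisons


benign_keys = [f"user{i:03d}" for i in range(120)]
crafted_keys = ["".join(p) for p in permutations("abcde")]  # 120 colliding keys

benign_cost = comparisons_for(benign_keys)
crafted_cost = comparisons_for(crafted_keys)
print("benign:", benign_cost, "crafted:", crafted_cost)
```

Both inputs contain 120 keys, but the crafted ones all share a bucket, so every insert scans everything inserted before it: 120 * 119 / 2 = 7140 comparisons.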

Vulnerabilities that don’t fit into a single category

Injection attacks - The application constructs some sort of a “command” string, such as a SQL query or an operating system command, using input provided by a user but doesn’t use the input in a “safe” way.
- Some of these are most commonly found in web applications but there are certainly instances of them using other network protocols
Insecure Deserialization - an application lets a user provide a “serialized” object, basically a specific instance of a class in a language like Java or C#, and the act of deserializing that object causes some undesirable behaviour.
- Most of these involve Java, but .NET deserialization is fairly prevalent and in theory you can find this type of vulnerability in any language that provides a serialization mechanism.
XML External Entity (XXE) injection - This one is specific to parsing XML but is fairly common
- An XML entity is sort of like defining a macro. If the XML parser isn’t configured properly, that entity can reference an external source for its value, so when the entity is referenced (i.e. the “macro” is used in the XML file) the parser attempts to fetch the value from that external source
- There are a few different types of XXE depending on the application - PortSwigger again has a decent explainer on this: https://portswigger.net/web-security/xxe
Server Side Request Forgery (SSRF) - tricking an application into sending a request to another (probably “internal”) service that the attacker wouldn’t normally be able to send requests to.
- Again, this is most commonly a web-based vulnerability but not exclusively so
- SSRF can be caused by other vulnerabilities like HTTP Request Smuggling or XXE
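
Of the items in this list, injection is probably the easiest to show in code. The sketch below uses an in-memory SQLite database from the Python standard library; the users table, its contents, and the two lookup functions are all made up for illustration. The unsafe version builds the query by pasting user input into the string, while the safe version passes it as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def lookup_unsafe(name: str) -> list:
    # User input becomes part of the SQL text itself.
    query = f"SELECT secret FROM users WHERE name = '{name}'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(name: str) -> list:
    # The ? placeholder keeps the input as data, never as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


payload = "nobody' OR '1'='1"
print(lookup_unsafe(payload))  # every secret in the table
print(lookup_safe(payload))    # nothing, no user has that name
```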

Where do we go from here?

Everything I've provided above is primarily meant to get you thinking about the sorts of things that may interest you and give you some possible starting points for your own journey.

That being said, the next post is going to focus entirely on debugging. We'll have a look at the basics of debugging in a variety of situations and on a variety of targets. The goal is to provide you enough information to set up a debugging environment, since it's not always straightforward, and use the basic functionality of several popular debuggers.

As a closing note, I don't have a fixed schedule for how often these posts will come. It's very much a "when it's done" sort of thing, but rest assured, if I say I'm going to post about something, I'll do it eventually!

I hope you found this post at least somewhat useful. Please feel free to reach out to me on Twitter or Mastodon (see the links top right) with any comments or constructive feedback and I look forward to sharing as much of my knowledge as I can!