*This article is a look into the houses of cards that comprise what most of us rely on every day for entertainment, information, and daily life. Where possible, proper disclosure practices were followed. This article is a continuation of Wardialing 2.0 and Broken Foundations. This article is for informational purposes only, and you are responsible for your own actions and use of this information.*
You might want to grab a beverage of your choice; we are in for a little adventure.
The Stack
With any application or website you use in the modern environment, there are often a number of components that make the service and experience possible for the user, which is you. Each service and application may differ, but they often rely on similar infrastructure. There are also a lot of misconceptions and marketing speak around clouds and serverless technologies. Let’s clear up a few things. A cloud is nothing more than a host providing resources to be used by an application, website, or service. All of the components are still there, just as if the company were running its own physical servers, only without the overhead. We also often hear the word serverless. This is a misnomer: there is still a machine executing the process, it just runs whatever application or work inside of a container. There are multiple layers of abstraction, but it makes a lovely marketing phrase to confuse the masses.
Along with cloud computing and serverless technologies, we have also seen the rise of cost-effective storage. This storage, however, is not on the hosts themselves but is often found in buckets. Common examples would be S3 storage by Amazon Web Services (AWS) and Cloud Storage by Google Cloud. Each of these providers hosts a large portion of the internet ecosystem we know today. Cloud providers have become so ubiquitous because, for a lot of companies, it is much easier to start this way or to stay on hosted platforms. Running your own equipment requires not just a huge cost overhead but also a personnel overhead. You would need a network team, a systems team, and a security team on top of your development and operations teams. Cloud providers strike a balance, as they simplify the network, systems, and security side of this equation.
This balance comes with a few risks, though, and one of those risks is complete reliance on a provider for your service stack. An outage could leave your customers and users without the ability to use any portion of your service. This becomes clear whenever AWS has a problem, like they did today.
The Slack
These companies do help simplify onboarding and processes to make launching your application or service as easy as possible, but there is a trade-off. It is often overlooked that at the core there are still machines running operating systems that run these processes. Another overlooked aspect is the default settings and configurations that services are deployed with. This can be a little confusing, as you are running your own platform on their system, but their system also has its own set of rules. Examples range from the default user permissions for the AWS graphical console, which are handled with IAM, to the configuration specifics of each service you may be using.
Each service you use within a cloud provider will have a number of different configurations that you can leverage for a variety of things, from access control to scaling and cost management. Access control is critical, as you need to be able to control who has access to your systems. Scaling triggers and definitions are heavily relied on so that you can accommodate user growth without impacting your applications. This does require some software architecture decisions to make your applications compatible with scaling through these triggers. I will discuss this type of design for scaling in a future article. Lastly, there is cost management. Cloud costs can get out of hand really quickly without proper configuration management and access control. This can even sink businesses and has been the ruin of many individuals. There are many horror stories around cloud costs, but that is the compromise agreed upon to run on someone else’s equipment.
Common provisioned services exist across most providers. Serverless containers, virtual machines, some dedicated hardware, lower-cost storage, database hosting services, and cache services are just a few examples. Each of these has its own set of policies for management and scaling. I should also mention that there is often a minimal default setup that allows these services to be deployed and used immediately, and this is where the problem begins to creep in.
The Pack
Now that we have a little information about these types of services and how companies use them to provide the modern experiences we know today, let’s start packing this information together to see just what we can find out about them. As mentioned, each of these services is ready for use, often with robust APIs that development teams can begin using immediately. The company using the cloud provider is responsible for its own “infrastructure,” and this is where some interesting information can be found. This is also where we can begin our reconnaissance. This step is important: it will allow us to gather useful information about the services, and it will also disclose a number of other interesting facts.
Each API offered by a provider has an endpoint. Endpoints are used to distinguish which service you are trying to reach as well as whose, and this relies on DNS. An example of this, and one that we are going to have a little fun with, is the following:
- *.storage.googleapis.com
- *.s3.us-east-1.amazonaws.com
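To make the role of DNS in these endpoints concrete, here is a minimal sketch that splits a hostname into a provider and a bucket name. The suffix table contains only the two patterns listed above, and the provider labels, the function name `split_bucket_host`, and the `example-assets` hostnames are assumptions made for illustration. Other regions and providers use different suffixes.

```python
from typing import Optional, Tuple

# Suffixes taken from the two patterns listed above; other regions and
# providers use different suffixes, so this table is illustrative only.
STORAGE_SUFFIXES = {
    ".storage.googleapis.com": "google-cloud-storage",
    ".s3.us-east-1.amazonaws.com": "aws-s3",
}


def split_bucket_host(hostname: str) -> Optional[Tuple[str, str]]:
    """Return (provider, bucket_name) if the hostname matches a known suffix."""
    # Normalize case and drop a trailing root dot, as DNS names allow both.
    host = hostname.strip().lower().rstrip(".")
    for suffix, provider in STORAGE_SUFFIXES.items():
        # Require at least one character in front of the suffix.
        if host.endswith(suffix) and len(host) > len(suffix):
            return provider, host[: -len(suffix)]
    return None


if __name__ == "__main__":
    # Hypothetical hostnames, not real buckets.
    for name in (
        "example-assets.storage.googleapis.com",
        "example-assets.s3.us-east-1.amazonaws.com",
        "www.example.com",
    ):
        print(name, "->", split_bucket_host(name))
```

Everything to the left of the provider suffix is the bucket name, which is why the name a team picks ends up visible to anyone who can see the hostname.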
Both of these endpoints are used for storage of websites and other content. Some may even contain important information for a particular company. In some rare cases you may even find sensitive and personally identifiable information of customers in these storage containers. But how do you find them? Let’s have a little fun.
There are a variety of ways to find subdomains. A subdomain is a domain associated with a primary domain name, and in this case let’s just use the Google subdomain of *.storage.googleapis.com. I could go to Google’s customer page located here. Simply looking through this list, one could also gain an understanding of how important branding is for these companies, and you can use this to your advantage. For example, Twitter uses Google’s platform. Let’s see: is there a Twitter storage subdomain?
It is typical to see branding associated with APIs, from big companies to small ones, so nothing out of the ordinary here. Are there other interesting ones you can find from that customer list? Sure, and from AWS customer lists as well. This information can become vital for discovery and for researching how infrastructure is running, or for vulnerability assessment. Teams will also often use definitive ways of identifying their own endpoints, with references like -test, -prod, or -dev. This is great for internal use, but when these are on the internet they allow an attacker to isolate the endpoint that makes the most sense to spend effort on.
Now that we have found some big-name buckets, do all buckets use this type of naming? No, a lot of randomly named buckets are also in use, because the organization or developer chose not to use an identifiable endpoint name, which is great operational security. To find them you could use a variety of tools like subfinder, or if the command line is not your thing you could use dnsdumpster. You could then easily put in those two example subdomains and find a whole listing of subdomains like the following:
Alright, now we have a list of a number of Google storage containers, and if we changed this to the s3 subdomain we would get a similar list for AWS. This is a great set of lists to start with, and this recon gives us insight into storage, but what else? Any of the service subdomains could be used, like RDS, Calendar, and others. I encourage you to read up on the cloud providers’ documentation of their services and use their subdomains to do some exploring. We could also do an entire subdomain search for a particular domain name if we wanted to look at a specific target. Let’s look at the following example:
Now we have identified a number of things about their infrastructure with just a simple subdomain search. In this case we were able to identify Jira, Gitlab, and a single sign-on service, along with other things, but those are the interesting ones. Each domain you search will have its own listings, and this is only a portion. Keep in mind that if we query these subdomains, they can give us an IP address that we can then look up to see where it resides or whether it is the final hop. For some of these you will find they are on Cloudfront and some on AWS EC2.
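If you run infrastructure yourself, it is worth seeing how much your own names give away. The sketch below groups a list of hostnames by the environment tags mentioned above. The function name `group_by_environment`, the sample hostnames, and the exact tag list are assumptions for illustration; real teams use many other conventions.

```python
import re
from typing import Dict, List

# Environment tags mentioned in the text; this list is an assumption and
# will miss conventions such as "staging" or "qa".
ENV_PATTERN = re.compile(r"(?:^|[-_.])(test|prod|dev)(?=$|[-_.])")


def group_by_environment(hostnames: List[str]) -> Dict[str, List[str]]:
    """Group hostnames by an environment tag found in their leftmost label."""
    groups: Dict[str, List[str]] = {}
    for host in hostnames:
        # Only the leftmost DNS label is chosen by the bucket or service owner.
        label = host.split(".", 1)[0].lower()
        match = ENV_PATTERN.search(label)
        env = match.group(1) if match else "unlabeled"
        groups.setdefault(env, []).append(host)
    return groups


if __name__ == "__main__":
    # Hypothetical names, not real buckets.
    sample = [
        "assets-prod.storage.googleapis.com",
        "dev-uploads.storage.googleapis.com",
        "product-images.storage.googleapis.com",
        "a8f3k2.storage.googleapis.com",
    ]
    for env, hosts in group_by_environment(sample).items():
        print(env, hosts)
```

The tag has to stand alone between separators, so a name like `product-images` is not mistaken for a production endpoint.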
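As a rough illustration of working out where an IP resides, the sketch below resolves a hostname, performs a reverse lookup on the address, and matches the resulting name against a few hosting suffixes. The suffix table and the names `classify_hosting` and `lookup` are assumptions for illustration. Reverse DNS is only a hint, and the IP ranges that providers publish are a better source for serious work.

```python
import socket
from typing import Optional

# Reverse-DNS suffixes assumed for illustration only; they are hints,
# not an authoritative mapping.
HOSTING_SUFFIXES = (
    (".cloudfront.net", "Amazon CloudFront"),
    (".compute.amazonaws.com", "AWS EC2"),
    (".compute-1.amazonaws.com", "AWS EC2"),
    (".googleusercontent.com", "Google Cloud"),
)


def classify_hosting(dns_name: str) -> Optional[str]:
    """Return a hosting label if the DNS name ends with a known suffix."""
    name = dns_name.lower().rstrip(".")
    for suffix, label in HOSTING_SUFFIXES:
        if name.endswith(suffix):
            return label
    return None


def lookup(hostname: str) -> Optional[str]:
    """Resolve a hostname, reverse-resolve its address, and classify the result.

    Requires network access; returns None when either lookup fails.
    """
    try:
        address = socket.gethostbyname(hostname)
        reverse_name, _aliases, _addresses = socket.gethostbyaddr(address)
    except OSError:  # covers socket.gaierror and socket.herror
        return None
    return classify_hosting(reverse_name)
```

A result of `None` means only that the name did not match the table, not that the host is self-hosted.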
The Track
Now that we have done some digging, we need to determine what is possible with this information, and this will require just a little more work. As I have stated numerous times before: “Hacking is 98% research, 1% fortitude, and 1% execution.” In this case I am not going to go into specifics for one company or another, but let’s just say that Company A may run its own instance of Gitlab or Github for storing its codebase because, for compliance reasons, it may not use Github.com. In some cases companies run their own to prevent accidental disclosure of API keys, SSH keys, and credentials. If an attacker ever got hold of these, it could end badly: it could become costly and harmful to reputation, intellectual property, and in some cases customers and users as well. There are pros and cons to hosting your own.
The pros of running your own repositories in your company have been explained, but what downsides are there? This can be a very slippery slope, as in most cases they are just as vulnerable as hosted solutions, if not more. Updating software to the latest patched version does have a consequence. For a home user this impact is not as painful. For a business, it could prevent a deployment and create downtime. This is why businesses have change release processes for these types of services, and why they are often the last ones to be patched. This can be a problem, as vulnerabilities are found and exploited daily. Some are reported and many are not; you can find some common vulnerabilities that have been reported here. You can often identify the version of software running on a service either by making a request via an API call or by using a web browser to go to the destination and port with a nonexistent endpoint. Often on error, software will present its version number. Use this for your research.
From the information inside a CVE you could potentially derive a number of attacks, and in some cases they can result in exploitation of a system containing very sensitive information such as credentials. This can be devastating to a company and could lead to further exploitation of other services. This research does not only apply to the service mentioned here; it applies to all software. Remember, software is written by humans, and humans do make mistakes. Now that we have gone down this track to identify a service vulnerability and what we could do with it, let’s get back to looking at a simpler attack, one you could do with very little skill in development and that could have huge implications for the target bucket owner.
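To see what your own services leak on error, you can scan a response header or error page for version-looking tokens. This is a heuristic sketch only: the pattern, the function name `find_versions`, and the `ExampleApp` banner are assumptions for illustration, and real banners vary widely enough that false positives and misses are expected.

```python
import re
from typing import List

# Matches "name/1.2.3", "name 1.2.3" or "name v1.2.3" style tokens.
# This is a rough heuristic, not a parser for any specific product.
VERSION_PATTERN = re.compile(r"\b([A-Za-z][\w.-]*?)[/ ]v?(\d+(?:\.\d+){1,3})\b")


def find_versions(text: str) -> List[str]:
    """Return "name version" strings for each version-like token in text."""
    return [f"{name} {version}" for name, version in VERSION_PATTERN.findall(text)]


if __name__ == "__main__":
    # Hypothetical header and error page content.
    print(find_versions("Server: nginx/1.18.0 (Ubuntu)"))
    print(find_versions("404 Not Found. Powered by ExampleApp v2.4.1"))
```

If this finds anything on your own error pages, most servers and frameworks have a setting to suppress version banners.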
The Attack
We went down the track of identifying specific services, how we could potentially find vulnerable ones, and what we could do to exploit them. That requires work and some understanding of software development. What if there was something that required even less knowledge of software development but could become quite costly or be used for subversive tactics? A common use of the storage buckets mentioned above is application data storage. This could be used for files shared between users and could have identifiable file names. In some cases it could be photos; it could be any number of things. So how do we access it?
*This information is censored to prevent an attack on the particular company and is for educational demonstration only. YOU ARE RESPONSIBLE FOR YOUR OWN ACTIONS.*
Let’s take a look at a bucket. In this case I will use the following:
https://imgs-****.storage.googleapis.com
I am able to upload a file to this bucket with the following curl command:
curl --upload-file "$filename" "https://imgs-****.storage.googleapis.com/$filename"
And I can retrieve it with the following curl command:
curl https://imgs-****.storage.googleapis.com/$filename
If you do some recon and want to find this bucket, the filename is something, and you will find this message. There is another reason I have censored the location: they have database backups in the same bucket. They have been contacted, but there has been no response. While this seems overly simple, and it is just me uploading a file, I can also delete the file with curl and PATCH a file. This could become a big problem for improperly protected storage buckets. Not to pick on just Google here: this is possible with AWS S3 buckets too.
As we can now see, there are some interesting things going on here. One may ask, why keep a database backup in such a storage bucket? Cost. The cost of disk on a virtual machine is much higher than the cost of a bucket, and bucket costs stay low as long as the traffic accessing it is also low. This is important for the cost management we mentioned early on. However, the compromise being made here is that these companies and service providers we rely on are often not following best practices. Providers are not forcing them to follow best practices, and this can pose a number of issues, including surprise costs. In the case of Google and AWS, an attacker can store a lot of data this way and even upload illegal content for others to obtain while a small company foots the bill.
There is also another attack surface that most overlook, and it is why I bring this up: schools. During covid we all shifted to remote learning and remote work, and children are doing homework and storing it remotely. Schools are often not following best practices, and because Google and AWS are not enforcing them, this leaves schools completely vulnerable to a PATCH to a homework file that is now malware spreading to the entire district. Sounds far-fetched? No. You could easily do it with the information provided in this article.
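If you own a bucket, one quick sanity check is whether an unauthenticated request to it succeeds at all. The sketch below sends a single anonymous GET to a URL you control and describes the HTTP status. The function names and the wording of the results are assumptions for illustration. It covers only the read side and is not a substitute for reviewing the bucket’s access policy in the provider’s console.

```python
import urllib.error
import urllib.request


def describe_anonymous_response(status: int) -> str:
    """Translate the HTTP status of an unauthenticated request into a hint."""
    if 200 <= status < 300:
        return "anonymous request succeeded: review this bucket's access policy"
    if status in (401, 403):
        return "anonymous request denied"
    if status == 404:
        return "bucket or object not found"
    return f"unexpected status {status}: check manually"


def check_anonymous_read(url: str, timeout: float = 10.0) -> str:
    """Send one unauthenticated GET to a URL you own and describe the result.

    Network failures (urllib.error.URLError) are left to propagate.
    """
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx and 5xx responses; the code is still useful.
        status = err.code
    return describe_anonymous_response(status)
```

A denied read does not prove that writes are denied too, so the write permissions still need to be checked in the bucket’s policy.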
The Flack
Not only do Big Tech companies not care about paying customers, they also do not care about the free services they give out. Nothing is free; you are paying for it one way or another. It is critical to understand the dangers of these large companies and their continued abuse of their customers and users. We are so quick to say let’s stop Big Tech while still using their products every day. In order to create change, we have to be the change. I hope that this information gives you some insight into how terribly companies like Amazon and Google treat the customers that pay them. While one could argue that it is up to the customer to lock their infrastructure down, AWS and Google both do a number of extended handholding processes in other parts of their ecosystem, so why not lock down storage? Keep in mind that other services can be exploited in similar ways. This applies to calendars, mail, and other vital functions provided by these providers.
The real problem here is that, along with telephone registration flows and reset vectors, we have now exposed issues with service providers, and each of these vectors applies to every facet of the tech stack. The reality is that with a couple of hours’ worth of research and experimenting you can create chaos for large providers as well as individuals. This is a huge problem. If this is not going to be addressed, then what are they responsible for? They have proven they are not responsible for your data, your safety, and now the costs you incur using them as a customer. It should be clear where the consumer stands with these companies. They are providing a service for people to save money and innovate without the overhead, but at a really high cost, as we can see. Because they lack safeguards for paying customers, those customers’ costs could go sky-high with attacks against the services.
Conclusion
I would like to thank my friend Geeknik for providing me information as well as for his disclosure to Google; their response was that it was working as intended. Please check out Geeknik’s work and support security researchers. I took some of his findings further, did a few experiments against a bucket of my own, and was able to successfully perform a zip bomb attack. The research for most of the material in this article was done during October, and I have finally gotten around to finishing it all up with some examples and information so you can experiment, for educational purposes only. There is a lot to unpack in here, and if you have questions, feel free to message me.
Recently a large number of services have been exploited, impacting private businesses and your data as well. One day these little musings might make an impact where they should, but until then I will continue to research, test, hack, and write about them. Leaving one Big Tech company for another company, only to make it Big Tech, is not how we win. Distributed networks and running our own infrastructure will be the real change. The next article will be about exploiting database services on each of these platforms.
Stay safe out there, happy hacking!
You can find me on Twitter, Signal, and Telegram.
This will be the last article under this username on Medium; I will be moving to @nixops.