How to change the user agent and HTTP headers, and use proxies, with the Python requests library

Requests is a Python HTTP library that makes it easy for developers to make HTTP(S) requests (GET, POST, etc.). The examples in this article use version 2.31.0, which identifies itself with the user agent python-requests/2.31.0.

How to install Python requests?

You can install it using the following command:

pip install requests

How to make GET requests with Python requests?

The code snippet below makes a simple HTTP GET request with the Python requests library to https://deviceandbrowserinfo.com/api/http_headers, then prints the status code of the response (200 if successful) and the content of the response.

import requests

response = requests.get('https://deviceandbrowserinfo.com/api/http_headers')

print(response.status_code)
print(response.text)

How to modify the default user agent?

The code above makes a request to https://deviceandbrowserinfo.com/api/http_headers, which returns the HTTP headers it received and their associated values. With Python requests, we obtain the following result:

{
  "Connection": "upgrade",
  "Host": "deviceandbrowserinfo.com",
  "X-Forwarded-For": "xx.yy.zz.aa",
  "User-Agent": "python-requests/2.31.0",
  "Accept-Encoding": "gzip, deflate",
  "Accept": "*/*"
}

We see that by default, Python requests sends the following user agent: python-requests/2.31.0. Note that the version number in the user agent matches the installed library version.
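To check which user agent your own installation sends by default without making a request, you can read it from the library itself. The short sketch below relies on requests.utils.default_user_agent(), the helper the library uses to build its default User-Agent header; the printed value depends on the version you have installed.

```python
import requests
from requests.utils import default_user_agent

# Default User-Agent string built by the library, e.g. "python-requests/2.31.0"
default_ua = default_user_agent()

# The suffix comes from the installed package version
expected_ua = f"python-requests/{requests.__version__}"

print(default_ua)
print(default_ua == expected_ua)
```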

To change the Python requests user agent, pass a headers dictionary with a User-Agent key when making an HTTP request:

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 13_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15'}
response = requests.get('https://deviceandbrowserinfo.com/api/http_headers', headers=headers)

With the headers parameter, the server returns our new user agent along with the previous HTTP headers:

{
  "Connection": "upgrade",
  "Host": "deviceandbrowserinfo.com",
  "X-Forwarded-For": "xx.yy.zz.aa",
  "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15",
  "Accept-Encoding": "gzip, deflate",
  "Accept": "*/*"
}

How can I change Python requests HTTP headers?

We may want to change several HTTP headers, not only the user agent, so that requests look like they come from a real browser and are less likely to be blocked (HTTP 403 response). In this case, we need to provide a headers dictionary that contains all the headers we want to modify. For example, to make it look like the requests are coming from a Chrome browser on macOS, we could provide the following headers:

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Language': 'en,fr-FR;q=0.9,fr;q=0.8',
    'Connection': 'keep-alive',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36',
    'sec-ch-ua': '"Google Chrome";v="125", "Chromium";v="125", "Not.A/Brand";v="24"',
    'sec-ch-ua-form-factors': '"Desktop"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"macOS"',
}
response = requests.get('https://deviceandbrowserinfo.com/api/http_headers', headers=headers)
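If you send several requests with the same browser-like headers, one option is to set them once on a requests.Session instead of passing headers to every call. The sketch below is a minimal illustration with a shortened header set; it uses Session.prepare_request to inspect the headers that would be sent, so it does not contact the server.

```python
import requests

URL = 'https://deviceandbrowserinfo.com/api/http_headers'

# Shortened header set for illustration; reuse the full dictionary in practice
browser_headers = {
    'Accept-Language': 'en,fr-FR;q=0.9,fr;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36',
}

session = requests.Session()
# Headers set on the session are merged into every request it sends
session.headers.update(browser_headers)

# Build the request without sending it, to inspect the final headers
prepared = session.prepare_request(requests.Request('GET', URL))
sent_headers = dict(prepared.headers)

for name, value in sent_headers.items():
    print(f'{name}: {value}')

# To actually send the request: response = session.get(URL)
```

Headers that the session does not override, such as Accept-Encoding, keep the library defaults.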

How can I use Python requests with a proxy?

Pass a proxies dictionary that maps each URL scheme (http, https) to the URL of your proxy, including your credentials if the proxy requires authentication:

proxies = {
    'http': 'http://username:password@proxyserver:port',
    'https': 'http://username:password@proxyserver:port',
}

response = requests.get('https://deviceandbrowserinfo.com/api/ip_address', proxies=proxies)
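If the proxy username or password contains reserved characters such as @ or :, the proxy URL becomes ambiguous unless they are percent-encoded. The helper below, build_proxies, is a hypothetical name introduced for this sketch, and the host, port and credentials are placeholders.

```python
from urllib.parse import quote


def build_proxies(username, password, host, port):
    """Return a proxies dict for requests with percent-encoded credentials."""
    # safe='' also encodes '/', '@' and ':' so they cannot break the URL
    user = quote(username, safe='')
    pwd = quote(password, safe='')
    proxy_url = f'http://{user}:{pwd}@{host}:{port}'
    # The same HTTP proxy is used for both http:// and https:// targets
    return {'http': proxy_url, 'https': proxy_url}


# Placeholder values, replace them with your own proxy details
proxies = build_proxies('username', 'p@ss:word', 'proxyserver.example', 8080)
print(proxies['https'])
```

The resulting dictionary can be passed as proxies=proxies to requests.get, exactly as in the snippet above.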

Does Python requests execute JavaScript?

No. When you make an HTTP request to a page that also contains JavaScript, Python requests doesn’t execute any JavaScript. It only retrieves the raw content at the requested URL (for example the HTML source of the page, or a JS or CSS file if you request one directly). If you want to execute JS, you should use a headless browser such as Headless Chrome.

How can I parse HTML with Python requests?

Python requests only downloads the HTML; to parse and analyze it, you can use the Beautiful Soup library (pip install beautifulsoup4). The example below makes a request to https://deviceandbrowserinfo.com/learning_zone, extracts all the links in the page, and prints the text and target URL of each one.

import requests
from bs4 import BeautifulSoup

response = requests.get('https://deviceandbrowserinfo.com/learning_zone')

# Parse the raw response body with Python's built-in HTML parser
soup = BeautifulSoup(response.content, 'html.parser')

# Find every <a> element in the page
links = soup.find_all('a')

for link in links:
    # Print the visible text and the target URL (None if the tag has no href)
    print(link.get_text(strip=True), link.get('href'))

How can I block requests coming from Python requests?

Block with the user-agent: You can block requests whose user agent contains the python-requests substring. However, you should keep in mind that an attacker can easily change this value.

Block using missing and inconsistent HTTP headers: If the attacker only changes the user agent, you can block HTTP requests that claim to come from standard browsers such as Chrome, Firefox, and Safari but lack the headers those browsers normally send, for example:

  • Missing accept-language
  • Missing client hints, such as sec-ch-ua, when the user agent claims to be Chrome (Firefox and Safari do not send this header)

Be careful about false positives when making this kind of blocking decision: less common browsers (outdated or non-standard) may legitimately omit these headers.
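The header-based checks above can be sketched as a small server-side function. This is a minimal illustration, not a production detection rule: the function name classify_request and the labels it returns are hypothetical, headers are assumed to arrive as a plain dictionary, and the client-hint check is applied only to user agents claiming to be Chrome.

```python
def classify_request(headers):
    """Return 'python-requests', 'inconsistent-browser' or 'ok' (hypothetical labels)."""
    # HTTP header names are case-insensitive, so normalize them first
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get('user-agent', '').lower()

    # Check 1: default user agent of the library
    if 'python-requests' in user_agent:
        return 'python-requests'

    browser_tokens = ('chrome/', 'firefox/', 'safari/')
    if any(token in user_agent for token in browser_tokens):
        # Check 2: a request claiming to be a browser should have accept-language
        if 'accept-language' not in normalized:
            return 'inconsistent-browser'
        # Check 3: client hints are only expected from Chrome-like user agents
        if 'chrome/' in user_agent and 'sec-ch-ua' not in normalized:
            return 'inconsistent-browser'
    return 'ok'


CHROME_UA = ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 '
             '(KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36')

default_headers = {'User-Agent': 'python-requests/2.31.0', 'Accept': '*/*'}
spoofed_headers = {'User-Agent': CHROME_UA, 'Accept': '*/*'}

print(classify_request(default_headers))
print(classify_request(spoofed_headers))
```

In this sketch, a request that only swaps the user agent is still flagged because it lacks accept-language and sec-ch-ua, while a request carrying the full Chrome header set shown earlier would pass.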

Block using TLS fingerprinting: Another option is to compute the TLS fingerprint of each connection and block the fingerprint values known to be produced by Python requests.
