What is the Facebook external hit user agent?
You may observe requests with a user agent containing the facebookexternalhit substring in your logs and wonder whether they are all linked to Facebook/Meta. These requests don't always originate from Facebook. They may also come from the iMessage link preview feature or from an attacker who spoofed the user agent. In this article, we explain how to distinguish between these situations.
Is the Facebook external hit substring always linked to Facebook/Meta?
No, not all requests whose user agent contains the facebookexternalhit substring are linked to Meta. Only requests whose user agent matches one of the following and whose IP address belongs to AS32934 (Facebook, Inc.) come from Facebook/Meta:
- facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
- facebookexternalhit/1.1
- facebookcatalog/1.0
Why does Facebook/Meta make requests with facebookexternalhit to my website?
Facebookexternalhit is the Facebook crawler. It retrieves information about websites or applications that are shared on Facebook. For example, when you paste a link into Messenger or Facebook, the crawler makes a request with the following user agent: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
The request comes from a Facebook IP address. In my experiment it was 31.13.127.2, which belongs to AS32934 (Facebook, Inc.).
Facebookexternalhit is also linked to the iMessage (iPhone message) link preview feature
In your logs, you may also see requests with a user agent that looks as follows:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 (KHTML, like Gecko) Version/9.0.1 Safari/601.2.4 facebookexternalhit/1.1 Facebot Twitterbot/1.0
User agents that contain both the facebookexternalhit and Twitterbot substrings are linked to the Apple iMessage application. Whenever you receive a link in a conversation, iMessage triggers a request with the user agent above to retrieve information such as the title, a short description, and the favicon of the site.
Contrary to requests made by the Facebook crawler, these requests come from the end user's IP address. Thus, the IP addresses you observe in your logs will belong to various (mobile) ISPs such as AT&T, Verizon, and Comcast, not to Facebook or Twitter (X).
How can I verify if a facebook external hit request comes from Facebook?
Facebook provides a procedure to authenticate its crawlers. As always, you should never rely solely on the user agent to authenticate a good bot, as this HTTP header can easily be spoofed by an attacker. Thus, you should:
- Verify that the request's user agent matches one of the following patterns: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php), facebookexternalhit/1.1, or facebookcatalog/1.0
- Verify that the request originates from AS32934 (Facebook, Inc.). To do that, you can either use the whois command below, which returns the list of IP ranges linked to this AS, or use an IP-related API such as IP Info.
The whois command to retrieve AS32934 (Facebook, Inc.) IP ranges:
whois -h whois.radb.net -- '-i origin AS32934' | grep ^route
It returns the IP ranges (CIDRs) linked to AS32934:
route: 31.13.24.0/21
route: 31.13.64.0/18
route: 31.13.64.0/19
route: 31.13.64.0/24
...
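Once the route list is available, the IP check itself is a simple membership test. The following is a minimal Python sketch of that step, not an official Facebook procedure. It assumes the whois output has been captured as text (here, only the four sample routes shown above), and the helper names parse_routes and is_in_networks are hypothetical. The second test address, 203.0.113.7, is a placeholder taken from a range reserved for documentation.

```python
import ipaddress

# Sample whois output copied from the routes listed above.
# A real query returns many more lines, including IPv6 "route6:" entries.
SAMPLE_WHOIS_OUTPUT = """\
route:      31.13.24.0/21
route:      31.13.64.0/18
route:      31.13.64.0/19
route:      31.13.64.0/24
"""


def parse_routes(whois_output):
    """Extract networks from lines that start with 'route:' or 'route6:'."""
    networks = []
    for line in whois_output.splitlines():
        if line.startswith(("route:", "route6:")):
            # Split only on the first colon so IPv6 prefixes stay intact.
            cidr = line.split(":", 1)[1].strip()
            networks.append(ipaddress.ip_network(cidr))
    return networks


def is_in_networks(ip, networks):
    """Return True if `ip` falls inside at least one of `networks`."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


networks = parse_routes(SAMPLE_WHOIS_OUTPUT)
print(is_in_networks("31.13.127.2", networks))  # IP observed in the experiment: True
print(is_in_networks("203.0.113.7", networks))  # placeholder IP: False
```

Because the ranges announced by an AS change over time, a list obtained this way should be refreshed periodically rather than hard-coded.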
I'm seeing spikes of traffic coming from facebookexternalhit, is it normal?
The first step is to verify whether the requests actually come from the Facebook crawler (see the previous section) or from a malicious bot. Reminder: you should never rely only on the user agent to authenticate a good bot.
If the requests come from Facebook's autonomous system, then the spike is not malicious. Note that it may still harm your infrastructure, for example by causing high CPU load and increased latency. This is a known issue that has been discussed for years:
- Facebook autobot going berserker (2011)
- A Facebook crawler was making 7M requests per day to my stupid website (2020)
- Excessive traffic from facebookexternalhit bot (2013)
- Facebook Crawler Bot Crashing Site (2012)
- facebookexternalhit/1.1 bot Excessive Requests, need to Slow Down (2020)
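When investigating a spike, it can help to triage log entries with the checks described in this article before deciding how to react. The sketch below is one possible way to do that in Python, under several assumptions: the log has already been parsed into (user agent, IP) pairs, FACEBOOK_NETWORKS holds only one illustrative CIDR from the whois output shown earlier, and the category names and sample entries are hypothetical. Note that the iMessage category is based on the user agent alone, so it describes a pattern rather than an authenticated source.

```python
import ipaddress
from collections import Counter

# Exact user agents the article attributes to the Facebook crawler.
FACEBOOK_CRAWLER_UAS = {
    "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)",
    "facebookexternalhit/1.1",
    "facebookcatalog/1.0",
}

# Illustrative subset of AS32934 ranges; a real check needs the full list.
FACEBOOK_NETWORKS = [ipaddress.ip_network("31.13.64.0/18")]

IMESSAGE_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 "
    "(KHTML, like Gecko) Version/9.0.1 Safari/601.2.4 "
    "facebookexternalhit/1.1 Facebot Twitterbot/1.0"
)


def classify(user_agent, ip):
    """Label a request using the user agent and IP checks from the article."""
    address = ipaddress.ip_address(ip)
    from_facebook_ip = any(address in net for net in FACEBOOK_NETWORKS)
    if user_agent in FACEBOOK_CRAWLER_UAS and from_facebook_ip:
        return "facebook-crawler"
    if "facebookexternalhit" in user_agent and "Twitterbot" in user_agent:
        return "imessage-pattern"
    # Anything else claiming facebookexternalhit could not be verified.
    return "unverified"


# Hypothetical log entries: (user agent, client IP).
entries = [
    ("facebookexternalhit/1.1", "31.13.127.2"),
    (IMESSAGE_UA, "198.51.100.23"),
    ("facebookexternalhit/1.1", "203.0.113.7"),
    ("facebookexternalhit/1.1", "203.0.113.7"),
]

counts = Counter(classify(ua, ip) for ua, ip in entries)
print(counts)
```

If most of the spike falls into the unverified category, the traffic is more likely to come from a bot spoofing the user agent than from the Facebook crawler.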
Conclusion
Requests with a user agent that contains the facebookexternalhit substring are linked to either the Facebook crawler or the iMessage link preview feature. The requests that come from iMessage look as follows: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 (KHTML, like Gecko) Version/9.0.1 Safari/601.2.4 facebookexternalhit/1.1 Facebot Twitterbot/1.0. They contain both facebookexternalhit and Twitterbot in their user agent.
To verify that a request containing facebookexternalhit actually comes from Facebook, you should not rely solely on the user agent. You should also verify that the IP address is linked to AS32934 (Facebook, Inc.).