
DoS fix / regex performance tuning#368

Merged
tsellers-r7 merged 6 commits into rapid7:master from tsellers-r7:regex_performance
Aug 2, 2021

@tsellers-r7 tsellers-r7 commented Jul 29, 2021

Motivation and Context

TL;DR - Fixed an unlikely DoS condition in a pattern in xml/ssh_banners.xml and improved the performance of certain patterns. Metrics are included below. This PR doesn't address all of the slower patterns but does get the worst of them.

In a PR ( rapid7/recog-java#7 ) for recog-java, @hudclark provided some data ( rapid7/recog-java#7 (comment) ) that would cause one of our regexes to hang indefinitely. That specific problem has already been fixed in #353.

Since recog is generally used to process data from untrustworthy sources, I wanted to make sure that this specific data and certain other inputs wouldn't cause indefinite hangs or otherwise have unacceptable performance characteristics when processed using any of recog's other fingerprint patterns. In Project Sonar we process hundreds of millions of banners on a regular basis, so small performance improvements can add up.

Findings:

  • One case of a regex that resulted in an indefinite hang when processing large numbers of spaces.
  • 50+ cases where performance could be up to 50x slower than it should be when processing certain strings.

Testing methodology

I wrote a script that processes a specific test string with every fingerprint pattern currently in recog. Since a single match can be incredibly fast, the script timed how long it took to run each pattern against the string 100,000 times. The script then emitted the results sorted by duration, longest first.
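The script itself isn't included in this PR, so the following is only a minimal sketch of that kind of harness, written in Python. The `PATTERNS` dictionary and the `time_patterns` helper are hypothetical stand-ins, not recog code, and Python's `re` engine will not reproduce the timings listed in this PR.

```python
import re
import time

# Hypothetical stand-ins for recog fingerprints: description -> regex.
PATTERNS = {
    "unbounded leading capture": r"^(.+) ESMTP ready$",
    "bounded leading capture": r"^(.{1,64}) ESMTP ready$",
    "literal prefix": r"^220 ",
}


def time_patterns(patterns, test_string, passes=100_000):
    """Return (seconds, description) pairs, slowest pattern first."""
    results = []
    for description, pattern in patterns.items():
        compiled = re.compile(pattern)
        start = time.perf_counter()
        for _ in range(passes):
            compiled.search(test_string)
        results.append((time.perf_counter() - start, description))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    # Digits-only input, similar in shape to the first test case.
    # Fewer passes than the 100,000 used in the PR, to keep the demo quick.
    for seconds, description in time_patterns(PATTERNS, "1" * 1500, passes=1_000):
        print(f"{seconds:.10f}    {description}")
```

Timing many passes rather than one smooths out timer resolution and scheduling noise, which matters when a single match takes microseconds.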

There are 3,879 patterns currently in recog. Using @hudclark's test string I found that 3,338 took less than 0.1 seconds to process. The longest duration was nearly 9 seconds. Using that as my benchmark I decided that anything over 2 seconds should probably be reviewed. There were 151 of these in this dataset alone. I did not address all of them because later tests revealed patterns that took up to 62.2 seconds and thus took priority.

First test case - hudclark's

The data that hudclark provided was a string of ~1,500 characters consisting solely of digits. It pointed out some patterns that took 2x to 4x longer than I'd like. These were quickly overshadowed by later tests.

Original results

8.9967840140    smtp_banners.xml : Some unknown mail server on OpenVMS
7.8108736030    smtp_banners.xml : Postfix - Ubuntu, Mail-in-a-Box package
6.1197626160    smtp_banners.xml : Exim - with version string and optional timestamp
5.8996496320    apache_os.xml    : Sun Cobalt RaQ (Red Hat based Linux)
5.0258794260    ssh_banners.xml  : Attachmate Reflection (formerly F-Secure SSH)
4.8205517380    ssh_banners.xml  : SSH Communications Security Tectia Server - branded
4.7821866920    ftp_banners.xml  : VxWorks on Tenor MultiPath with version information
4.5126619130    html_title.xml   : HPE ProCurve Switch w/Hostname
4.4242824360    ftp_banners.xml  : Simple tnftpd banner with a version
4.3949423450    ftp_banners.xml  : FTPD on Mac OS X Server without a version
4.3856097340    ftp_banners.xml  : FTPD on Mac OS X Server with a version
3.9750543660    smtp_banners.xml : Exim - with digit only version string and optional timestamp
3.8346248230    smtp_banners.xml : Exim - without version string and with optional timestamp
3.7241216040    ssh_banners.xml  : GlobalScape SSH (which uses Bitvise sshlib)
3.6714761190    smtp_banners.xml : Exim - with version string and optional timestamp (Ubuntu)
3.6398599910    smtp_banners.xml : Lotus Domino SMTP MTA
3.6345964780    smtp_banners.xml : IBM Domino SMTP MTA
3.6345602600    smtp_banners.xml : MailEnable - Complex
3.6194079850    smtp_banners.xml : Microsoft IIS builtin SMTP service - Windows Server 2016
3.5649147010    smtp_banners.xml : ArGoSoft Mail Server - freeware version
3.5586633600    smtp_banners.xml : Microsoft IIS builtin SMTP service, or Microsoft Exchange Server (they are differentiated from each 
3.5554307840    smtp_banners.xml : Microsoft IIS builtin SMTP service - Windows Server 2019
3.4988867800    ftp_banners.xml  : WS_FTP FTP Server on Windows - X2 variant
3.4517308090    smtp_banners.xml : MailEnable - Simple
3.4345158460    smtp_banners.xml : ArGoSoft Mail Server - Pro version
3.2968048820    smtp_banners.xml : Barracuda Email Security Gateway - physical or virtual appliance
3.0335567370    smtp_banners.xml : Postfix - generic banner
2.9669229730    ssh_banners.xml  : MOVEit DMZ (which uses Bitvise sshlib)

Results after changes

4.7759784650    apache_os.xml             : Sun Cobalt RaQ (Red Hat based Linux)
3.1016591790    smtp_banners.xml          : Some unknown mail server on OpenVMS
3.0229940320    smtp_banners.xml          : Postfix - Ubuntu, Mail-in-a-Box package
2.9605090640    ftp_banners.xml           : WU-FTPD on various OS
2.0218865210    smtp_banners.xml          : Some simple PERL SMTP server
1.9946286070    smtp_banners.xml          : Exim - with version string and optional timestamp
1.9475264900    snmp_sysdescr.xml         : IBM AIX 5.1 on PowerPC - network software variant
1.9208211020    snmp_sysdescr.xml         : IBM AIX 4.2 on PowerPC
1.9114472270    snmp_sysdescr.xml         : IBM AIX 7.1 on PowerPC
1.9111828570    snmp_sysdescr.xml         : IBM VIOS 6.1 on PowerPC
1.9071520260    snmp_sysdescr.xml         : IBM AIX 5.3 on PowerPC
1.8773198910    snmp_sysdescr.xml         : IBM VIOS 5.3 on PowerPC
1.8667792300    snmp_sysdescr.xml         : IBM AIX 6.1 on PowerPC
1.8612884210    snmp_sysdescr.xml         : IBM AIX 5.2 on PowerPC
1.8287487190    snmp_sysdescr.xml         : IBM AIX 5.1 on PowerPC
1.8107181170    snmp_sysdescr.xml         : IBM AIX 4.3 on PowerPC
1.6810278930    html_title.xml            : HPE ProCurve Switch w/Hostname
1.6473212850    operating_system.xml      : Linux catch-all
1.6397817460    smtp_banners.xml          : ArGoSoft Mail Server - Pro version
1.5850398950    ftp_banners.xml           : FTPD on Mac OS X Server with a version
1.5735731000    ftp_banners.xml           : FTPD on Mac OS X Server without a version
1.5482504690    ftp_banners.xml           : Simple tnftpd banner with a version
1.4420507300    smtp_banners.xml          : Microsoft IIS builtin SMTP service - Windows Server 2019
1.4401679190    smtp_banners.xml          : Exim - without version string and with optional timestamp
1.4353632330    smtp_banners.xml          : Exim - with digit only version string and optional timestamp
1.4187387360    smtp_banners.xml          : Microsoft IIS builtin SMTP service, or Microsoft Exchange Server (they are differentiated from each
1.3978994670    smtp_banners.xml          : Exim - with version string and optional timestamp (Ubuntu)
1.3971444610    smtp_banners.xml          : IBM Domino SMTP MTA
1.3892989060    operating_system.xml      : Vendor-based Linux catch-all
1.3832903850    smtp_banners.xml          : Microsoft IIS builtin SMTP service - Windows Server 2016
1.3652716290    smtp_banners.xml          : Lotus Domino SMTP MTA
1.3557667300    smtp_banners.xml          : MailEnable - Complex
1.3427350980    snmp_sysdescr.xml         : ADTRAN TotalAccess shelf
1.3179144220    smtp_banners.xml          : ArGoSoft Mail Server - freeware version
1.3018863310    smtp_banners.xml          : MailEnable - Simple
1.2340919060    http_wwwauth.xml          : Generic F5 Big-IP
1.2201002650    smtp_banners.xml          : Barracuda Email Security Gateway - physical or virtual appliance
1.2120660080    snmp_sysdescr.xml         : Troy PocketPro Print Server
1.2079617100    ftp_banners.xml           : WS_FTP FTP Server on Windows - X2 variant
1.1994696540    http_wwwauth.xml          : HP Instant Support Enterprise Edition with a hostname
1.1949410020    http_wwwauth.xml          : SPIP publishing system (www.spip.net)
1.1658059770    smtp_banners.xml          : AppleShare IP Mail Server
1.1290798350    smtp_banners.xml          : Non-specific banner with optional hostname
1.0990665840    smtp_banners.xml          : MDaemon mail server
1.0765341810    smtp_banners.xml          : Postfix - generic banner
1.0726890780    smtp_banners.xml          : Seattle Labs SLMail server for Windows NT/2k (v2.7 runs on Win9x)
1.0614999190    smtp_banners.xml          : Sendmail - with date, w/o version or platform, optional status string.
1.0576200570    smtp_banners.xml          : Postfix - generic w/o ESMTP
1.0459751810    smtp_banners.xml          : MDaemon mail server - with version revision
1.0398510560    smtp_banners.xml          : MDaemon mail server - with service pack
1.0357135650    imap_banners.xml          : CMU Cyrus IMAP
1.0201472830    smtp_banners.xml          : MDaemon mail server - without timestamp
1.0141742870    smtp_banners.xml          : SonicWall Email Security
1.0035786600    smtp_banners.xml          : MDaemon mail server - with timestamp, unre

Second test case - High numbers of spaces

The second test was high numbers of spaces followed by various characters.

" " * 10000 + "1!(&(HFUH*&GEGG#((@*#(@&#H@H 37H7293H423H4H H&#(@$&H#$H@#$"

This found some really poorly performing patterns.

Original results

56.0342034670   telnet_banners.xml        : ACT Security IP Cameras
55.8205223180   telnet_banners.xml        : Grandstream IP Cameras
37.8397985290   apache_os.xml             : Sun Cobalt RaQ (Red Hat based Linux)
25.0883285960   operating_system.xml      : Vendor-based Linux catch-all
21.9379492780   http_servers.xml          : AVM FRITZ! devices of various types
15.9361667600   operating_system.xml      : Many BSD family OSes
12.9748227420   operating_system.xml      : Linux catch-all
11.9357084810   smtp_banners.xml          : Cisco PIX firewall MailGuard banner stripping
11.9207645880   telnet_banners.xml        : Arescom System
11.8829591350   imap_banners.xml          : CMU Cyrus IMAP on Mac OS X
11.8625381440   html_title.xml            : Emerson Network Power IntelliSlot Web Card and rebrands
11.8392341890   html_title.xml            : Digi Terminal Servers
11.8375095510   imap_banners.xml          : CMU Cyrus IMAP
11.7661826580   html_title.xml            : Synology DiskStation
10.1009548480   snmp_sysdescr.xml         : SGI IRIX64
10.0152383050   http_wwwauth.xml          : Generic F5 Big-IP
9.9775247190    http_wwwauth.xml          : HP Instant Support Enterprise Edition with a hostname
9.9719278960    http_wwwauth.xml          : SPIP publishing system (www.spip.net)
9.8582514080    snmp_sysdescr.xml         : Brocade VDX Switch w/Hostname and BR prefix
9.8163400080    snmp_sysdescr.xml         : Brocade VDX Switch w/Hostname
9.6856156550    ftp_banners.xml           : MikroTik with description
9.6665103310    snmp_sysdescr.xml         : SGI IRIX
9.4721248390    html_title.xml            : Jenkins Customized Dashboard
6.7060986900    ftp_banners.xml           : APC device
6.5243437520    snmp_sysdescr.xml         : CA SystemEDGE Management Agent
6.4842920340    ssh_banners.xml           : Attachmate Reflection (formerly F-Secure SSH)
6.4619459850    ssh_banners.xml           : SSH Communications Security Tectia Server - branded
6.4498859630    ssh_banners.xml           : VanDyke VShell
1.6392240520    smtp_banners.xml          : Lotus Domino SMTP MTA
1.6287923250    smtp_banners.xml          : IBM Domino SMTP MTA
1.1259730260    ntp_banners.xml           : Isilon OneFS NTP Server

Results after changes

4.0261433210    apache_os.xml             : Sun Cobalt RaQ (Red Hat based Linux)
1.6170253780    telnet_banners.xml        : ACT Security IP Cameras
1.4962433150    telnet_banners.xml        : Grandstream IP Cameras
1.4914041500    operating_system.xml      : Linux catch-all
1.3436797230    smtp_banners.xml          : Cisco PIX firewall MailGuard banner stripping
1.1601262450    ftp_banners.xml           : MikroTik with description
1.1225218650    http_wwwauth.xml          : SPIP publishing system (www.spip.net)
1.1219601170    http_wwwauth.xml          : Generic F5 Big-IP
1.0966860850    http_wwwauth.xml          : HP Instant Support Enterprise Edition with a hostname
1.0772802380    html_title.xml            : Jenkins Customized Dashboard
0.9965503540    ftp_banners.xml           : Lexmark printer with OS version
0.9814544530    ftp_banners.xml           : Lexmark printer
0.9703127250    ftp_banners.xml           : Lexmark Optra Printer
0.9638518870    apache_os.xml             : Red Hat Fedora
0.9628669260    ntp_banners.xml           : Greyware Automation Products, Inc. Domain Time II on Windows Server 2003
0.9626863580    apache_os.xml             : Novell SuSE Linux
0.9587067290    apache_os.xml             : Turbolinux
0.9577916020    apache_os.xml             : CentOS Linux
0.9524640540    apache_os.xml             : White Box Enterprise Linux
0.9481747460    apache_os.xml             : Red Hat Linux

It also found an indefinite hang in the following fingerprint.

  <fingerprint pattern="^([\s]*)\s*VShell$">
    <description>VanDyke VShell</description>
    <param pos="1" name="service.version"/>
    <param pos="0" name="service.vendor" value="VanDyke Software"/>
    <param pos="0" name="service.family" value="VShell"/>
    <param pos="0" name="service.product" value="VShell"/>
    <param pos="0" name="service.cpe23" value="cpe:/a:vandyke:vshell:{service.version}"/>
  </fingerprint>

The problem here, besides the fact that it doesn't actually match what it should match, is that there are two back-to-back subpatterns, ([\s]*) and \s*, that each match arbitrary-length strings consisting solely of whitespace. When the input starts with a very long run of whitespace and the rest of the pattern fails to match, the regex engine retries every way of splitting that run between the two subpatterns. This catastrophic backtracking ultimately results in a DoS.

This has been fixed so that the pattern actually captures a version ( ([\s]*) -> ([\d.]{0,8}) ), and I have bounded both the capture and the number of spaces that follow it. I've also located and added an example so that it can be tested.
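To make the before/after behaviour concrete, here is a small Python sketch. The `NEW` pattern is an assumption: the PR states the new capture group and that the trailing spaces were bounded, but not the exact bound, so the `{0,8}` on the whitespace is a guess. The banner string is also hypothetical and is not the example added to recog.

```python
import re
import time

# Original pattern from the fingerprint above.
OLD = re.compile(r"^([\s]*)\s*VShell$")
# Assumed shape of the fix: bounded version capture, bounded whitespace.
# The {0,8} on the spaces is a guess; the PR only says both were bounded.
NEW = re.compile(r"^([\d.]{0,8})\s{0,8}VShell$")


def timed_search(compiled, text):
    """Run one search and return (match, elapsed seconds)."""
    start = time.perf_counter()
    match = compiled.search(text)
    return match, time.perf_counter() - start


# Hypothetical banner, not taken from the recog examples.
banner = "4.0.3 VShell"
# A scaled-down version of the hostile input: whitespace, then junk.
hostile = " " * 2000 + "1!(&("

old_banner, _ = timed_search(OLD, banner)
new_banner, _ = timed_search(NEW, banner)
old_hostile, old_seconds = timed_search(OLD, hostile)
new_hostile, new_seconds = timed_search(NEW, hostile)

print("old matches banner:", old_banner is not None)
print("new version capture:", new_banner.group(1))
print(f"hostile input: old {old_seconds:.6f}s, new {new_seconds:.6f}s")
```

The old pattern rejects the versioned banner outright and can only ever "capture" whitespace, while the bounded pattern gives up on the hostile input after a handful of attempts. Absolute timings depend on the regex engine, so the numbers printed here are not comparable to the Ruby figures in this PR.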

Third test case - High numbers of digits

The third test was high numbers of digits followed by various characters.

"1" * 10000 + " 1 1 1 1 1 1 1!(&(HFUH*&GEGG#((@*#(@&#H@H 37H7293H423H4H H&#(@$&H#$H@#$"

This found some really poorly performing patterns.

Original results

62.2014746590   smtp_banners.xml          : Exim - with version string and optional timestamp
52.8767169970   smtp_banners.xml          : Exim - with version string and optional timestamp (Ubuntu)
52.8208191560   smtp_banners.xml          : Exim - with digit only version string and optional timestamp
52.7975289820   smtp_banners.xml          : Exim - without version string and with optional timestamp
52.3167042240   ftp_banners.xml           : WU-FTPD on various OS
51.4864477590   smtp_banners.xml          : Some unknown mail server on OpenVMS
46.1896201370   smtp_banners.xml          : Postfix - Ubuntu, Mail-in-a-Box package
30.9910663220   smtp_banners.xml          : Some simple PERL SMTP server
29.3768304260   ftp_banners.xml           : VxWorks on Tenor MultiPath with version information
27.5495923040   ftp_banners.xml           : FTPD on Mac OS X Server without a version
27.3867953350   ftp_banners.xml           : FTPD on Mac OS X Server with a version
27.2770124840   ftp_banners.xml           : Simple tnftpd banner with a version
27.1238213490   html_title.xml            : HPE ProCurve Switch w/Hostname
23.1229762360   smtp_banners.xml          : Microsoft IIS builtin SMTP service - Windows Server 2019
22.7525733850   smtp_banners.xml          : Microsoft IIS builtin SMTP service - Windows Server 2016
22.6087005870   smtp_banners.xml          : Microsoft IIS builtin SMTP service, or Microsoft Exchange Server (they are differentiated from each
21.7201759360   ssh_banners.xml           : GlobalScape SSH (which uses Bitvise sshlib)
21.1139485530   smtp_banners.xml          : MailEnable - Complex
20.6582233320   smtp_banners.xml          : ArGoSoft Mail Server - Pro version
20.6220830090   smtp_banners.xml          : MailEnable - Simple
20.5010151230   smtp_banners.xml          : ArGoSoft Mail Server - freeware version
19.9526087580   ftp_banners.xml           : WS_FTP FTP Server on Windows - X2 variant
19.7185171060   smtp_banners.xml          : Barracuda Email Security Gateway - physical or virtual appliance
18.2195681420   smtp_banners.xml          : Sendmail - with date, w/o version or platform, optional status string.
18.1322891150   smtp_banners.xml          : Postfix - generic banner
17.5702630040   ssh_banners.xml           : MOVEit DMZ (which uses Bitvise sshlib)
17.4410302960   smtp_banners.xml          : Non-specific banner with optional hostname
17.3782624710   ssh_banners.xml           : Bitvise WinSSHD (which uses Bitvise flowssh) without version
17.3029551180   ssh_banners.xml           : Bitvise WinSSHD (which uses Bitvise sshlib)
17.2932633670   ssh_banners.xml           : Bitvise WinSSHD (which uses Bitvise flowssh) with version
17.2915634910   telnet_banners.xml        : Arescom System
17.1731641920   smtp_banners.xml          : Postfix - generic w/o ESMTP
17.1022652980   smtp_banners.xml          : Seattle Labs SLMail server for Windows NT/2k (v2.7 runs on Win9x)
17.0683201570   smtp_banners.xml          : SonicWall Email Security
16.7933822220   ftp_banners.xml           : Generic/unknown FTP Server found on HP-UX and AIX systems
16.7673324420   ftp_banners.xml           : Digital/Compaq/HP Tru64 Unix
16.2784882470   smtp_banners.xml          : Exim - with hostname
15.8740536150   ftp_banners.xml           : EMC Celerra
15.8062564010   ftp_banners.xml           : SunOS/Solaris
15.6839181690   ftp_banners.xml           : D-Link DCS-2100 wireless internet camera
15.6799965800   ftp_banners.xml           : SunOS/Solaris 5.7-5.10
15.6286584560   ftp_banners.xml           : SunOS 5.6 (Solaris 2.6)
15.6260092010   http_servers.xml          : TVersity Media Server UPnP Server
15.5651652290   snmp_sysdescr.xml         : OpenVMS
15.5314584070   smtp_banners.xml          : Sendmail - HP-UX
15.5252992790   smtp_banners.xml          : SAP SMTP Server
15.5130025490   smtp_banners.xml          : Sendmail - unknown (date in version string variant)
15.5056539470   smtp_banners.xml          : Twisted SMTP server
15.5043730660   snmp_sysdescr.xml         : Digital/Compaq/HP Tru64 Unix
15.5007399600   snmp_sysdescr.xml         : Digital/Compaq/HP Tru64 Unix - Digital branding variant
15.4999997370   imap_banners.xml          : Nortel CallPilot
15.4992330520   ftp_banners.xml           : ProFTPD no valid servers configured
15.4793170970   smtp_banners.xml          : JAMES SMTP Server
15.4733769450   ftp_banners.xml           : Generic FTP fingerprint with a hostname and a version for a generic FTP implementation
15.4636058210   ftp_banners.xml           : Vermillion FTP Daemon
15.4467479570   nntp_banners.xml          : Lyris Listmanager
15.4423889100   ftp_banners.xml           : Digital/Compaq/HP Tru64 Unix w/o branding
15.4228032960   http_servers.xml          : TVersity Media Server UPnP Server with Service Pack
15.4073356340   snmp_sysdescr.xml         : Ciena Optical - software version variant
15.3926435070   pop_banners.xml           : VMware Zimbra POP
15.3706809290   ftp_banners.xml           : QVT/Net FTP Server
15.3620001270   sip_banners.xml           : Audiocodes-Sip-Gateway
15.3529426880   smtp_banners.xml          : A.K.I PMail
15.3326299790   html_title.xml            : DD-WRT
15.2858002480   imap_banners.xml          : VMware Zimbra IMAP
15.2693765940   smtp_banners.xml          : Postfix - Ubuntu
15.2365666670   html_title.xml            : Opengear Management Console
15.2239056050   ftp_banners.xml           : Generic FTP fingerprint with a hostname
15.2061906910   imap_banners.xml          : VMware Zimbra IMAP with service version
15.1921625270   ftp_banners.xml           : MikroTik
15.1764020960   pop_banners.xml           : VMware Zimbra POP with version
15.0505003690   smtp_banners.xml          : Sendmail - HP-UX with a PHNE (HP Networking patch) installed
14.9730190510   smtp_banners.xml          : Sendmail - optional timezone and timestamp, w/o OS
14.9248020380   ftp_banners.xml           : WU-FTPD on HPUX with a PHNE (HP Networking patch) installed
14.8956099750   smtp_banners.xml          : Sendmail - with timezone and timestamp, w/o timezone offset or OS
14.8876590590   smtp_banners.xml          : Sendmail - revision variant 1
14.8814069040   ftp_banners.xml           : FTP on HPUX with a PHNE (HP Networking patch) installed
14.8648282120   smtp_banners.xml          : Sendmail - revision variant 2
14.8625984880   smtp_banners.xml          : Sendmail - with version and date (optional timezone), w/o config version
14.8164089150   smtp_banners.xml          : TIS FWTK and derivatives (other firewalls, like Gauntlet, are derived from TIS)
14.8158731080   smtp_banners.xml          : MDaemon mail server - with service pack
14.8144348010   pop_banners.xml           : OSX Cyrus POP
14.8026640740   smtp_banners.xml          : Sendmail - Unixware
14.7975874960   smtp_banners.xml          : Rockliffe MailSite - without version (http://www.rockliffe.com)
14.7815324620   smtp_banners.xml          : Rockliffe MailSite - with version (http://www.rockliffe.com)
14.7791759920   smtp_banners.xml          : Merak mail server - http://www.icewarp.com/merakmail/ (runs on 2000/NT/9x)
14.7764297530   smtp_banners.xml          : Critical Path (aka InScribe) Messaging Server on Windows NT4/2k, Solaris 2.6/2.7/2.8 Sparc/Intel, SG
14.7677652510   smtp_banners.xml          : VOPMail http://www.vircom.com/en/products/vopmail/vopmail.shtml
14.7523344680   smtp_banners.xml          : Symantec Mail Security for SMTP
14.7094625760   smtp_banners.xml          : MDaemon mail server - without timestamp
14.7050192040   smtp_banners.xml          : Postfix - Debian
14.7036873650   pop_banners.xml           : CMU Cyrus POP

Results after changes

8.0470019440    snmp_sysdescr.xml         : IBM AIX 5.2 on PowerPC
7.9755911430    snmp_sysdescr.xml         : IBM AIX 5.1 on PowerPC
7.9641827540    snmp_sysdescr.xml         : IBM AIX 4.3 on PowerPC
7.8851526810    snmp_sysdescr.xml         : IBM VIOS 6.1 on PowerPC
7.8705615730    snmp_sysdescr.xml         : IBM AIX 5.1 on PowerPC - network software variant
7.8687723620    snmp_sysdescr.xml         : IBM AIX 7.1 on PowerPC
7.8554276860    snmp_sysdescr.xml         : IBM AIX 5.3 on PowerPC
7.8470225000    snmp_sysdescr.xml         : IBM VIOS 5.3 on PowerPC
7.8090047600    snmp_sysdescr.xml         : IBM AIX 6.1 on PowerPC
7.7672986410    snmp_sysdescr.xml         : IBM AIX 4.2 on PowerPC
7.5699785390    snmp_sysdescr.xml         : ADTRAN TotalAccess shelf
7.3499280930    snmp_sysdescr.xml         : Troy PocketPro Print Server
4.3804945620    apache_os.xml             : Sun Cobalt RaQ (Red Hat based Linux)
2.8988255560    ftp_banners.xml           : WU-FTPD on various OS
2.6178284110    smtp_banners.xml          : Some unknown mail server on OpenVMS
2.5778376810    smtp_banners.xml          : Postfix - Ubuntu, Mail-in-a-Box package
1.9334668370    smtp_banners.xml          : Exim - with version string and optional timestamp
1.7580667320    smtp_banners.xml          : Some simple PERL SMTP server
1.5651648520    html_title.xml            : HPE ProCurve Switch w/Hostname
1.5619708440    ftp_banners.xml           : FTPD on Mac OS X Server without a version
1.5422140600    ftp_banners.xml           : Simple tnftpd banner with a version
1.5390978490    ftp_banners.xml           : FTPD on Mac OS X Server with a version
1.5137707720    operating_system.xml      : Linux catch-all
1.3914740270    operating_system.xml      : Vendor-based Linux catch-all
1.3657760490    smtp_banners.xml          : Microsoft IIS builtin SMTP service - Windows Server 2016
1.3607990110    smtp_banners.xml          : Lotus Domino SMTP MTA
1.3509236500    smtp_banners.xml          : Exim - with version string and optional timestamp (Ubuntu)
1.3397536240    smtp_banners.xml          : Exim - without version string and with optional timestamp
1.3212323660    smtp_banners.xml          : IBM Domino SMTP MTA
1.3135177590    smtp_banners.xml          : Microsoft IIS builtin SMTP service - Windows Server 2019
1.3102424240    smtp_banners.xml          : Microsoft IIS builtin SMTP service, or Microsoft Exchange Server (they are differentiated from each
1.3029141200    smtp_banners.xml          : Exim - with digit only version string and optional timestamp
1.2345793420    http_wwwauth.xml          : Generic F5 Big-IP

Changes

Most of the performance issues are due to unbounded matches at the beginning of a regex pattern, such as ^(.+). These are compounded when followed by another unbounded match or optional string, such as ^([\s]*)\s*.

In general I have modified these matches to limit the number of characters they can match. The limit is based on the longest string that I think is plausible in that location plus some padding. Note that I have only updated specific cases where performance is particularly poor; I don't currently have time to change all of them.

Changes:

  • Most limits are some power of two.. for reasons..
  • Hostnames are limited to 512 characters. DNS names are limited to 255 octets, but who can say what folks will do. 512 should be safe enough for what we are doing. If we find counterexamples we can update the fingerprint patterns.
  • Version strings have been limited to what I think is reasonable given the examples and, in some cases, rounded up to the next power of two to account for variations that aren't in the examples.
  • Patterns matching spaces (' '* or ' '+) have typically been bounded to 8 spaces. We typically see 1 to 3 in those cases.
    • We occasionally see a string where the hostname has been replaced with spaces, giving 30+ spaces, but it's rare.
  • Matches for product names or arbitrary text strings have been capped at a limit that I thought was reasonable based on context. In some cases these are rounded up to the next power of two as well.
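As an illustration of the bounding approach described in the list above, here is a Python sketch with a hypothetical before/after pair. Neither pattern is copied from recog, the banner is made up, and the 512/8/32 limits simply follow the rules of thumb listed above.

```python
import re

# Hypothetical before/after pair; neither is copied from recog.
UNBOUNDED = re.compile(r"^(\S+) +ESMTP Exim (\S+)")
# Hostname capped at 512, spaces at 8, version at 32 (a power of two
# comfortably above a plausible version string).
BOUNDED = re.compile(r"^(\S{1,512}) {1,8}ESMTP Exim (\S{1,32})")

# Hypothetical banner used only for illustration.
banner = "mail.example.com ESMTP Exim 4.94.2 Thu, 29 Jul 2021 10:00:00 +0000"
# Third test case from this PR: many digits followed by junk.
hostile = "1" * 10000 + " 1 1 1 1 1 1 1!(&(HFUH*&GEGG#((@*#(@&#H@H 37H7293H423H4H H&#(@$&H#$H@#$"

for name, compiled in (("unbounded", UNBOUNDED), ("bounded", BOUNDED)):
    match = compiled.search(banner)
    print(name, "captures:", match.groups())
    print(name, "matches hostile input:", compiled.search(hostile) is not None)
```

Both versions extract the same fields from a normal banner and both reject the hostile string; the difference is how much input the engine is allowed to consume and back out of before giving up. The trade-off is that a banner exceeding a cap, for example a 600-character hostname, no longer matches, which is why the limits include padding.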

How Has This Been Tested?

Test script, rspec, and the built-in banner examples.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have updated the documentation accordingly (or changes are not required).
  • I have added tests to cover my changes (or new tests are not required).
  • All new and existing tests passed.

@tsellers-r7

FYI @hdm @pberry25 @dabdine @hudclark - DoS fix, perf improvements.

@tsellers-r7 tsellers-r7 changed the title from "Regex performance tuning" to "DoS fix / regex performance tuning" Jul 29, 2021

tsellers-r7 commented Jul 29, 2021

RE: DoS of the pattern ^([\s]*)\s*VShell$ - Note that a single pass parsing the hostile string is unlikely to lock a thread. I've tested a single pass using strings up to 230,000 spaces and it only takes about 173 seconds.

Single pass of a very long string

time (seconds)     spaces in the string
36.0334558980      100,000 spaces
43.6640835770      110,000 spaces
51.7362357030      120,000 spaces
56.6313171530      130,000 spaces
67.7340673740      140,000 spaces
74.0117239220      150,000 spaces
85.8210566990      160,000 spaces
96.2465403620      170,000 spaces
106.1771895450     180,000 spaces
116.9775963660     190,000 spaces
133.2909717900     200,000 spaces
149.1175060270     210,000 spaces
158.7890004720     220,000 spaces
173.8417361620     230,000 spaces

The impact would only show up at scale in a tight loop and so is unlikely to be seen in production use. In my testing, processing a string with 10,000 spaces 100,000 times didn't finish in less than an hour.

100,000 passes against strings of varying lengths

time (seconds)       spaces in the string
 323.6256381380      1,000 spaces
 383.2706699500      1,100 spaces
 443.3035777380      1,200 spaces
 518.4471275420      1,300 spaces
 614.1090654840      1,400 spaces
 705.1677564260      1,500 spaces
 819.3073768360      1,600 spaces
 942.2954785300      1,700 spaces
1087.9842339440      1,800 spaces
1222.4977055390      1,900 spaces
1310.9148665100      2,000 spaces
1368.7335526370      2,100 spaces
1503.6999848810      2,200 spaces
1806.9055347080      2,300 spaces
1857.4349697600      2,400 spaces
2096.8682116410      2,500 spaces
2175.7759852790      2,600 spaces
2200.4016691080      2,700 spaces
2364.7430004260      2,800 spaces
2543.2670773660      2,900 spaces
2719.5293841550      3,000 spaces
2890.8367312860      3,100 spaces
3061.4903932360      3,200 spaces
3231.4202396480      3,300 spaces
3418.0183147580      3,400 spaces
3638.9564253100      3,500 spaces
3861.4577583760      3,600 spaces
4087.9136099080      3,700 spaces
4330.6940260590      3,800 spaces
4629.0580120130      3,900 spaces
4868.3415041860      4,000 spaces
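As a rough sanity check on how the cost scales, the Python sketch below estimates a growth exponent from a few rows of the two tables above. The `growth_exponent` helper is hypothetical and assumes time grows like n**k between two samples; the timings are copied from the tables, and everything else is plain arithmetic.

```python
import math

# (spaces, seconds) rows copied from the two tables above.
single_pass = [(100_000, 36.0334558980), (200_000, 133.2909717900)]
many_passes = [(1_000, 323.6256381380), (2_000, 1310.9148665100), (4_000, 4868.3415041860)]


def growth_exponent(a, b):
    """Estimate k in time ~ n**k from two (n, seconds) samples."""
    (n1, t1), (n2, t2) = a, b
    return math.log(t2 / t1) / math.log(n2 / n1)


print(f"single pass, 100k -> 200k: k = {growth_exponent(*single_pass):.2f}")
print(f"100,000 passes, 1k -> 2k:  k = {growth_exponent(many_passes[0], many_passes[1]):.2f}")
print(f"100,000 passes, 2k -> 4k:  k = {growth_exponent(many_passes[1], many_passes[2]):.2f}")

# Average cost of one pass over 1,000 spaces, in milliseconds.
per_pass_ms = many_passes[0][1] / 100_000 * 1000
print(f"one pass over 1,000 spaces: about {per_pass_ms:.1f} ms")
```

Each estimate comes out close to 2, which is consistent with roughly quadratic growth in the number of leading spaces: doubling the run of whitespace costs about four times as much. It also shows why a single pass is tolerable while the same input in a tight loop is not.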

@tsellers-r7 tsellers-r7 merged commit bde25e6 into rapid7:master Aug 2, 2021
@tsellers-r7 tsellers-r7 deleted the regex_performance branch August 2, 2021 19:52
@tsellers-r7 tsellers-r7 mentioned this pull request Sep 2, 2021
@mkienow-r7 mkienow-r7 mentioned this pull request Aug 31, 2022