is_sql_injectable(domain)

or let's protect websites whose username tables are up for grabs

By Pascal van Kooten

MSc Methods & Statistics

Working at Jibes Data Analytics

Open source projects:

yagmail      send emails in 2 lines (html/attach)    246
sky          next-gen intelligent web scraping        57
gittyleaks   find users/keys/pass in git repos        18
pytrending   discover trending Python                 10
xtoy         automatic prep/model/predict              2

Interesting for Python because:

  1. "big data"
  2. cloud
  3. scraping
  4. sending email

Demo

http://www.gtbit.org/news/viewitem.php?id=40

{"domain": "http://www.gtbit.org",
 "url": "http://www.gtbit.org/news/viewitem.php?id=40",
 "injectable": true,
 "on line": true,
 "error": false,
 "at line": false,
 "time": "Wed Oct 28 00:59:39 2015",
 "warning": true,
 "failed_request": false,
 "emails": ["gtbit@rediffmail.com", "inderjeet@gmail.com"],
 "sql": true}
                    
What could a hacker do with this?
  • He could use sqlmap (written in Python) to figure out which tables are in the database.
  • He could then obtain the username table.
  • He could then run password attacks on the website using those usernames.
  • He could possibly find other useful information in the tables.

Testing injectability

CommonCrawl

Action                                         Amount
Web data of 145 TB                             1.81 billion
URL contains "php?"                            109,715
Keep only unique domains                       27,046
Append single quote
Test for SQL errors on page                    1,742
if error: scan homepage + contact for email    692
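
The "append single quote" and "test for SQL errors on page" steps can be sketched roughly as follows. This is only an illustration, not the code from injectable.py: the helper names with_quote, error_flags and looks_injectable are hypothetical, and both the substring checks and the final rule are assumptions that merely mirror the keys shown in the demo output.

```python
# Hypothetical helpers; the real checks in injectable.py may differ.

# Substrings looked for in the page, named after the keys in the demo output.
MARKERS = ("sql", "error", "warning", "on line", "at line")


def with_quote(url):
    """Append a single quote to the end of the URL (its last parameter value)."""
    return url + "'"


def error_flags(page_text):
    """Map each marker to True when it occurs in the page (case-insensitive)."""
    lowered = page_text.lower()
    return {marker: marker in lowered for marker in MARKERS}


def looks_injectable(flags):
    """Assumed rule: an SQL mention plus an error or a warning on the page."""
    return flags["sql"] and (flags["error"] or flags["warning"])


if __name__ == "__main__":
    # A typical PHP warning, as it might be shown after the quote breaks a query.
    sample = ("Warning: mysql_fetch_array() expects parameter 1 to be resource, "
              "boolean given in /var/www/viewitem.php on line 12")
    print(with_quote("http://example.com/viewitem.php?id=40"))
    print(error_flags(sample))
    print(looks_injectable(error_flags(sample)))
```

A more careful check would probably compare against the page fetched without the quote, so that words like "error" already present in the normal content do not count.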

Scripts:

  1. aws_master.py
  2. aws_client.py
  3. ec2_template.py
  4. injectable.py

import re
# A run of characters that are allowed inside one part of an email address
part = r'[^?@ ><\'":\\\/]+'
email_re = re.compile(part + '@' + part + r'\.' + part)
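
As a usage sketch, the pattern above can be applied to the text of a homepage or contact page. The find_emails helper and the sample HTML are made up for illustration; only the two regex lines come from the slides.

```python
import re

# Same pattern as on the slide: three runs of allowed characters joined by '@' and '.'.
part = r'[^?@ ><\'":\\\/]+'
email_re = re.compile(part + '@' + part + r'\.' + part)


def find_emails(page_text):
    """Return the unique email-like strings found in the page, sorted."""
    return sorted(set(email_re.findall(page_text)))


if __name__ == "__main__":
    html = '<a href="mailto:info@example.com">Mail</a> or sales@shop.example.org today'
    print(find_emails(html))
```

Because the character class only excludes a handful of delimiters, trailing punctuation such as a sentence-ending period can end up inside a match, so some post-filtering is likely needed on real pages.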
                    


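import time

import warc
from boto.s3.key import Key
from gzipstream import GzipStreamFile

# wetpaths, dones, pds (the S3 bucket), slugger and save_file_s3
# are assumed to be defined elsewhere in the script.
for wet_path in wetpaths:
    swp = slugger(wet_path)
    # Skip files that were already processed in an earlier run
    if swp in dones:
        continue
    t1 = time.time()
    results = []
    # Open a streaming connection to one of the WET files (WARC format) on S3
    k = Key(pds, wet_path)
    f = warc.WARCFile(fileobj=GzipStreamFile(k))
    for record in f:
        if record.url is not None and 'php?id=' in record.url:
            results.append(record.url)
    # Report how long this file took, then store the matching URLs
    print(time.time() - t1)
    save_file_s3('\n'.join(results), swp)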
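
The skip-if-done and filter logic of the loop above can be tried without S3 access. The sketch below is an assumption-laden stand-in: slugger here is a hypothetical implementation (the original is not shown), Record imitates only the url attribute of a WET record, and read_records replaces the S3 + warc streaming.

```python
from collections import namedtuple

# Stand-in for a WET record; only the `url` attribute is used here.
Record = namedtuple("Record", "url")


def slugger(path):
    """Hypothetical slug: turn an S3 path into a flat file name."""
    return path.replace("/", "_")


def php_id_urls(records):
    """Keep the URLs of records that carry a 'php?id=' query."""
    return [r.url for r in records if r.url is not None and "php?id=" in r.url]


def process(paths, dones, read_records):
    """Filter every path that was not processed before; return {slug: urls}."""
    output = {}
    for path in paths:
        slug = slugger(path)
        if slug in dones:
            continue  # already handled in an earlier run
        output[slug] = php_id_urls(read_records(path))
    return output


if __name__ == "__main__":
    fake = [Record("http://a.com/x.php?id=1"), Record(None),
            Record("http://b.com/index.html")]
    # Pretend the first file was finished earlier; only the second is processed.
    print(process(["crawl/1.wet.gz", "crawl/2.wet.gz"],
                  {"crawl_1.wet.gz"}, lambda path: fake))
```

Keeping the set of finished slugs makes the job restartable, which matters when a worker dies halfway through thousands of files.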
                    

Crucial lessons:

  1. Typical map+reduce problem
  2. Better to spend a little money than to put in extra time
  3. We need more distributed stuff...
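
To make the map+reduce shape concrete, here is a minimal single-machine sketch of the "keep only unique domains" step. The function names are hypothetical and this is not the talk's AWS setup; on a cluster each map_file call would run on a different worker and only the small partial sets would be merged.

```python
from functools import reduce
from urllib.parse import urlsplit


def domain_of(url):
    """Return scheme + host, e.g. 'http://www.example.com'."""
    parts = urlsplit(url)
    return parts.scheme + "://" + parts.netloc


def map_file(urls):
    """Map step: one file's URLs -> the set of domains that have a 'php?' URL."""
    return {domain_of(u) for u in urls if "php?" in u}


def merge(left, right):
    """Reduce step: union two partial domain sets."""
    return left | right


def unique_domains(files):
    """Run the map step per file and fold the partial results together."""
    return reduce(merge, map(map_file, files), set())


if __name__ == "__main__":
    files = [
        ["http://a.com/x.php?id=1", "http://a.com/y.php?id=2", "http://b.org/index.html"],
        ["https://c.net/z.php?p=3", "http://a.com/x.php?id=9"],
    ]
    print(sorted(unique_domains(files)))
```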

Questions?

Should I email?

kootenpv.github.io
PascalvKooten
kootenpv
kootenpv@gmail.com
pascalvkooten