Forum Discussion

jcberlan
23 days ago

Trick or Treat on Specter Street: Widow's Web

I am very stuck on Trick or Treat on Specter Street: Widow's Web.

I can't do any of the questions, but in any case I'll start with the 4th, which is the first answerable one:

Your first task is to simulate the loyal Crawlers. Run legitimate-crawler and inspect the output in Lab-Files to observe their behavior.

To simulate the rogue Crawlers, you must discover the hidden paths on the website. Read the blog posts – they contain clues. Disallow these in Website-Files/robots.txt and run malicious-crawler.

Inspect the output in Lab-Files. What is the token?

  1. I have created the robots.txt file, since I understand that malicious-crawler goes expressly there. My robots.txt contains all the URLs I can imagine:

Disallow: /secret
Disallow: /treat
Disallow: /hidden
Disallow: /crypt
Disallow: /warden
Disallow: /rituals
Disallow: /witch-secrets
Disallow: /admin
Disallow: /vault
Disallow: /uncover
Disallow: /post1
Disallow: /post2
Disallow: /post3
Disallow: /post4
Disallow: /contact
Disallow: /drafts/rituals

But the output in malicious-crawler.txt doesn't give me a token or a hint.

  • I have curl-ed all the pages looking for words such as token and found nothing (rough loop below).
  • I have found some keywords in http://127.0.0.1:3000/witch-secrets, such as intercepted-incantations, decoded them and got nothing.
  • I have searched spider-sightings.log for what happened at 3:00 am, but nothing.
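
For reference, something like this is roughly what I mean by curl-ing the pages for the first point (the page list is just a guess, the real pages may differ):

for PAGE in blog about contact witch-secrets; do
  curl -s http://127.0.0.1:3000/$PAGE | grep -i token
done

# for the intercepted-incantations I tried a plain base64 decode,
# e.g. (placeholder value; the real strings came from /witch-secrets)
echo 'SGVsbG8=' | base64 -d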

Can someone give me a hint?

6 Replies

  • You've blocked too much. E.g. the posts (mistyped URL in your list, btw) should be allowed.
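
    Also, make sure your robots.txt starts with a User-agent line; Disallow rules normally only apply inside a User-agent block. Something like this shape (the paths here are just placeholders, not the real ones):

    User-agent: *
    Disallow: /some-hidden-path
    Disallow: /another-hidden-path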

    if you're completely stuck try this script :)

    cd ~/Desktop
    # get all URLS
    malicious-crawler
    legitimate-crawler
    
    # give em some time
    sleep 5
     
    #fetch the pages visited by the legitimate crawler :)
    mkdir -p archive
    i=0
    URLS=`cat Lab-Files/legitimate-crawler.txt | awk '{print $4}' | grep http`
    for URL in $URLS; do
      let i=i+1
      curl -s "$URL" > archive/$i
    done
     
    #pull the text inside <i> tags from the downloaded pages (these become candidate URLs later)
    grep -ohP '(?<=<i>).*?(?=</i>)' archive/* > /tmp/spidered
    #remove old stuff
    rm archive/*
    
    
    #compile some stuff we've found
    cat Lab-Files/spider-sightings.log | awk '{print $7}' | sed 's/^\///g' >> /tmp/spidered
    cat /tmp/spidered | sort | uniq | egrep -v "^$|blog.*|about|contact" > /tmp/uniqlist
    
    
    # now check for valid URLs
    true > /tmp/valid-urls
    LIST=`cat /tmp/uniqlist`
    for L in $LIST; do
      if curl -o /dev/null -s --head --fail http://127.0.0.1:3000/$L; then
        echo $L >> /tmp/valid-urls
      fi
    done
    
    
    #final disallow list :)
    echo "User-agent: *" > Website-Files/robots.txt
    cat /tmp/valid-urls | awk '{print "Disallow: /"$1}' >> Website-Files/robots.txt
    
    #now run the crawlers and get the token
    malicious-crawler
    legitimate-crawler
    
    # give em some time :)
    sleep 5
    
    grep token Lab-Files/*
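
    To sanity-check that the site is actually serving what you wrote (assuming robots.txt is exposed at the web root, which I haven't confirmed in this lab), you can do:

    curl http://127.0.0.1:3000/robots.txt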

     

  • I don't think it tells you to disallow the blog posts themselves.

  • You might want to look through the pages again and really think some of those paths through... There might be an order to what you're given that you need to check.

  • Hi all. Did anyone manage to solve Q4: Inspect the output in Lab-Files. What is the token?

    I checked all the output .txt files in the Lab-Files folder but could not find anything. Any hints?
