Forum Discussion
Trick or Treat on Specter Street: Widow's Web
I am very stuck in Trick or Treat on Specter Street: Widow's Web
I can't do any of the questions, but in any case I'll start with Q4, which is the first answerable one:
Your first task is to simulate the loyal Crawlers. Run legitimate-crawler and inspect the output in Lab-Files to observe their behavior.
To simulate the rogue Crawlers, you must discover the hidden paths on the website. Read the blog posts – they contain clues. Disallow these in Website-Files/robots.txt and run malicious-crawler.
Inspect the output in Lab-Files. What is the token?
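For reference, a minimal sketch of the shape the task seems to expect in Website-Files/robots.txt, assuming the crawlers honor the standard User-agent/Disallow syntax; the paths here are placeholders, not the actual hidden ones.
cat > Website-Files/robots.txt <<'EOF'
User-agent: *
Disallow: /some-hidden-path
Disallow: /another-hidden-path
EOF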
- I have created the robots.txt file, since I understand that malicious-crawler looks there specifically. My robots.txt contains every URL I can imagine:
Disallow: /secret
Disallow: /treat
Disallow: /hidden
Disallow: /crypt
Disallow: /warden
Disallow: /rituals
Disallow: /witch-secrets
Disallow: /admin
Disallow: /vault
Disallow: /uncover
Disallow: /post1
Disallow: /post2
Disallow: /post3
Disallow: /post4
Disallow: /contact
Disallow: /drafts/rituals
But the resulting malicious-crawler.txt gives me neither a token nor a hint.
- I have curl-ed all the pages looking for words like "token" and found nothing (roughly what the sketch after this post automates).
- I have found some keywords in http://127.0.0.1:3000/witch-secrets, such as intercepted-incantations, decoded them, and got nothing.
- I have searched spider-sightings.log for what happened at 3:00 am, but nothing.
Can someone give me a hint?
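A rough sketch of the curl/grep sweep and the log check described above; the page list and the timestamp format are guesses, so adjust them to whatever the site and spider-sightings.log actually contain.
# sweep a few known pages for the word "token" (page list is illustrative only)
for PAGE in post1 post2 post3 post4 about contact witch-secrets; do
  if curl -s http://127.0.0.1:3000/$PAGE | grep -qi token; then
    echo "the word token appears on /$PAGE"
  fi
done
# look for 3 am entries in the sightings log (assumes 24-hour HH:MM timestamps)
grep '03:0' Lab-Files/spider-sightings.log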
5 Replies
- steven
Silver II
You've blocked too much. E.g. the posts (mistyped URL in your list, btw) should be allowed.
If you're completely stuck, try this script :)

cd ~/Desktop
# get all URLs
malicious-crawler
legitimate-crawler
# give em some time
sleep 5
# fetch entries from legitimate crawlers :)
mkdir archive
URLS=`cat Lab-Files/legitimate-crawler.txt | awk '{print $4}' | grep http`
for URL in $URLS; do
  let i=i+1
  curl $URL > archive/$i
done
# get emphasised (<i>) links from website content (later URLs)
grep -ohP '(?<=<i>).*?(?=</i>)' archive/* > /tmp/spidered
# remove old stuff
rm archive/*
# compile some stuff we've found
cat Lab-Files/spider-sightings.log | awk '{print $7}' | sed 's/^\///g' >> /tmp/spidered
cat /tmp/spidered | sort | uniq | egrep -v "^$|blog.*|about|contact" > /tmp/uniqlist
# now check for valid URLs
true > /tmp/valid-urls
LIST=`cat /tmp/uniqlist`
for L in $LIST; do
  if curl -o /dev/null -s --head --fail http://127.0.0.1:3000/$L; then
    echo $L >> /tmp/valid-urls
  fi
done
# final disallow list :)
echo "User-agent: *" > Website-Files/robots.txt
cat /tmp/valid-urls | awk '{print "Disallow: /"$1}' >> Website-Files/robots.txt
# now run the crawlers and get the token
malicious-crawler
legitimate-crawler
# give em some time :)
sleep 5
grep token Lab-Files/*

- neeemu
Bronze III
I don't think it tells you to disallow the blog posts themselves.
- Samh051
Bronze II
You only need to disallow the "hidden paths"
- Irish
Bronze I
You might want to look through the pages again and really think some of those paths through... There might be an order to what you're given that you need to check.
- PRABAKARANRAMAMURTHY
Bronze III
Hi all. Did anyone manage to solve Q4: Inspect the output in Lab-Files. What is the token?
I checked all the output txt files in the Lab-Files folder but could not find anything. Any hints?