Forum Discussion
jcberlan
Bronze II
23 days ago
Trick or Treat on Specter Street: Widow's Web
I am very stuck on Trick or Treat on Specter Street: Widow's Web. I can't do any of the questions, but in any case I'll start with the 4th, since it's the first answerable one: "Your first task is to simulate t..."
steven
Silver II
5 days ago
You've blocked too much, e.g. the posts (mistyped URL in your list, btw) should be allowed.
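For reference, the final robots.txt should only disallow the hidden pages the spiders turned up; anything you don't list (like the posts) stays crawlable by default. Roughly like this (these path names are made up, the real ones come out of the script below):
User-agent: *
Disallow: /some-hidden-page
Disallow: /another-hidden-page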
if you're completely stuck try this script :)
cd ~/Desktop
# run both crawlers to generate the URL logs
malicious-crawler
legitimate-crawler
# give em some time
sleep 5
#fetch entries from legitimate crawlers :)
mkdir archive
URLS=`cat Lab-Files/legitimate-crawler.txt | awk '{print $4}' | grep http`
for URL in $URLS; do
let i=i+1
curl $URL > archive/$i
done
#extract the paths hidden inside <i> tags in the downloaded pages (these are more URLs to check)
grep -ohP '(?<=<i>).*?(?=</i>)' archive/* > /tmp/spidered
#remove old stuff
rm archive/*
#compile some stuff we've found
cat Lab-Files/spider-sightings.log | awk '{print $7}' | sed 's/^\///g' >> /tmp/spidered
cat /tmp/spidered | sort | uniq | egrep -v "^$|blog.*|about|contact" > /tmp/uniqlist
# now check for valid URLs
true > /tmp/valid-urls
LIST=`cat /tmp/uniqlist`
for L in $LIST; do
if curl -o /dev/null -s --head --fail http://127.0.0.1:3000/$L; then
echo $L >> /tmp/valid-urls
fi
done
#final disallow list :)
echo "User-agent: *" > Website-Files/robots.txt
cat /tmp/valid-urls | awk '{print "Disallow:/"$1}' >> Website-Files/robots.txt
#now run the crawlers and get the token
malicious-crawler
legitimate-crawler
# give em some time :)
sleep 5
grep token Lab-Files/*
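btw, to sanity-check what the crawlers will actually see, you can fetch robots.txt straight from the site (this assumes the app serves Website-Files/robots.txt at the web root, which is the usual setup but I haven't double-checked here):
curl http://127.0.0.1:3000/robots.txt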