Forum Discussion

Bronze II

6 months ago

Solved

Help with Introduction to Python Scripting: Ep.7 – Demonstrate Your Skills

Hello all,

I am stuck on the last question of this Immersive Labs lab. The question is below.

Using Python, build a web scraper to scrape the website for 12-digit phone numbers beginning with + (e.g., +123456789012). The requests and BeautifulSoup4 (BS4) libraries are available to you. How many extracted phone numbers are returned?

I wrote the following Python script:

import requests
from bs4 import BeautifulSoup
import re

url = "http://10.102.35.108:4321"
try:
    response = requests.get(url)
    response.raise_for_status()  # Raise an exception for bad status codes
except requests.exceptions.RequestException as e:
    print(f"Error fetching the page: {e}")
    exit()
soup = BeautifulSoup(response.text, 'html.parser')
phone_pattern = r"\+\d{12}"
found_numbers = re.findall(phone_pattern, soup.get_text())
num_found = len(found_numbers)
print(f"Found {num_found} phone numbers:")
for number in found_numbers:
    print(number)

The script reports 0 phone numbers, but the lab marks that answer as incorrect. Please help.
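
One detail of the pattern that may be worth checking: `\+\d{12}` also matches the first 12 digits of a longer digit run, so a 13-digit number would be counted as well. A minimal sketch, assuming the lab wants exactly 12 digits (that reading of the task is an assumption, and the sample text is invented, not taken from the lab's website):

```python
import re

# Hypothetical sample text, not taken from the lab's website.
sample = "Call +123456789012 or +1234567890123, fax +12345."

# Loose pattern: matches any '+' followed by at least 12 digits.
loose = re.findall(r"\+\d{12}", sample)

# Strict pattern: (?!\d) rejects a match that is followed by a further digit.
strict = re.findall(r"\+\d{12}(?!\d)", sample)

print(loose)
print(strict)
```

If both patterns give the same count on the real pages, the stricter form makes no difference for this lab.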

netcat
6 months ago
Perfect code. However, the answer field expects you to count duplicates as well. I didn't notice this at first because I just ran wget, then grep "+", and counted the matches manually.
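
The difference between the two counts can be illustrated with a short sketch. The page text below is invented for the example; the point is only that a `set` collapses repeated numbers while a list keeps every occurrence.

```python
import re

# Invented page text: the same number appears on two lines.
text = """
Office: +111111111111
Mobile: +222222222222
Fax:    +111111111111
"""

matches = re.findall(r"\+\d{12}", text)

total_with_duplicates = len(matches)  # every occurrence counts
unique_only = len(set(matches))       # repeats collapse to one entry

print(total_with_duplicates, unique_only)
```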

Generally, it's best to get a local copy first (that can be analyzed manually), and then implement automatic analysis. If something goes wrong, you can always take a look at the local copy.
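
A rough Python equivalent of that workflow, where the analysis step reads only local files, might look like the sketch below. The helper name `count_numbers_in_dir` and the demo pages are assumptions made for illustration; in practice the directory would hold pages saved by wget or a similar tool.

```python
import re
import tempfile
from pathlib import Path

PHONE = re.compile(r"\+\d{12}")

def count_numbers_in_dir(root):
    """Count every match (duplicates included) in each *.html file under root."""
    per_file = {}
    for path in sorted(Path(root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        per_file[path.relative_to(root).as_posix()] = len(PHONE.findall(text))
    return per_file

# Demo on a throwaway directory with invented pages.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "index.html").write_text("<p>No numbers here</p>", encoding="utf-8")
    Path(tmp, "contact.html").write_text(
        "<p>+111111111111</p><p>+111111111111</p>", encoding="utf-8"
    )
    counts = count_numbers_in_dir(tmp)

print(counts)
print("Total:", sum(counts.values()))
```

The per-file counts make it easy to open a single saved page and verify the number by eye.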

6 Replies

DCadet
Bronze II
6 months ago
The URL I used was the following:
url = "http://10.102.35.108:801"
- netcat
  Advocate
  6 months ago
  You're supposed to download the website recursively, i.e. the main page and every page it links to, e.g. chariry.html etc.
  On the main page there are no phone numbers, that's right. On the other pages there are a few.
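
Following the other pages comes down to resolving each link against the page it appears on and keeping only links on the same host. A small sketch using only `urllib.parse`; the host, port, file names, and the helper name `same_site_links` are placeholders for illustration, not details of the lab:

```python
from urllib.parse import urldefrag, urljoin, urlparse

def same_site_links(page_url, hrefs, start_url):
    """Resolve hrefs found on page_url; keep those on start_url's host and port."""
    site = urlparse(start_url).netloc
    kept = []
    for href in hrefs:
        # Make the link absolute and drop any '#fragment' part.
        full = urldefrag(urljoin(page_url, href)).url
        parts = urlparse(full)
        if parts.scheme in ("http", "https") and parts.netloc == site:
            kept.append(full)
    return kept

start = "http://example.test:801/"
links = same_site_links(
    "http://example.test:801/about.html",
    ["index.html", "/contact.html", "http://other.test/", "mailto:a@b.c", "#top"],
    start,
)
print(links)
```

Comparing hosts against the start URL keeps the crawl on the site no matter which page a link was found on.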
  - DCadet
    Bronze II
    6 months ago
    Thank you for your insight. I now get 14, but it's still wrong.
    
    import requests
    from bs4 import BeautifulSoup
    import re
    from urllib.parse import urljoin

    visited = set()
    phone_numbers = set()

    def scrape(url):
        if url in visited:
            return
        visited.add(url)
        print(f"Scraping {url}")
        try:
            r = requests.get(url)
            soup = BeautifulSoup(r.text, 'html.parser')
            # Find 12-digit numbers starting with +
            matches = re.findall(r'\+\d{12}', r.text)
            phone_numbers.update(matches)
            # Recurse into internal links
            for link in soup.find_all('a', href=True):
                full_url = urljoin(url, link['href'])
                if full_url.startswith(url):  # ensure we stay within the same site
                    scrape(full_url)
        except Exception as e:
            print(f"Error scraping {url}: {e}")

    start_url = f"http://10.102.50.198:{801}/"
    scrape(start_url)
    print("Extracted phone numbers:")
    for number in phone_numbers:
        print(number)
    print(f"Total unique phone numbers found: {len(phone_numbers)}")
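
Two things in the script above may explain the wrong count: `phone_numbers` is a `set`, so repeated numbers are dropped, and `full_url.startswith(url)` compares against the page currently being scraped rather than the start URL, so pages linked only from sub-pages can be skipped. Below is a hedged sketch of a variant that keeps every occurrence and checks the host instead. To stay runnable without a network, it takes a `fetch` function as a parameter and uses the standard library's `HTMLParser` in place of BeautifulSoup. With the lab's libraries, `fetch` could be `lambda u: requests.get(u).text`. The names `crawl` and `LinkCollector` and the fake site at the bottom are invented for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

PHONE = re.compile(r"\+\d{12}")

class LinkCollector(HTMLParser):
    """Collect href values of <a> tags (stand-in for soup.find_all('a', href=True))."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)

def crawl(start_url, fetch):
    """Visit every same-host page once; return all matches, duplicates included."""
    site = urlparse(start_url).netloc
    visited, numbers = set(), []
    queue = [start_url]
    while queue:
        url = queue.pop()
        if url in visited:
            continue
        visited.add(url)
        html = fetch(url)
        numbers.extend(PHONE.findall(html))  # a list, so repeats are kept
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.hrefs:
            full = urldefrag(urljoin(url, href)).url  # absolute URL, no fragment
            if urlparse(full).netloc == site:         # compare with the start host
                queue.append(full)
    return numbers

# Invented three-page site used only to exercise the sketch.
FAKE_SITE = {
    "http://example.test:801/": '<a href="a.html">A</a>',
    "http://example.test:801/a.html": '+111111111111 <a href="b.html">B</a>',
    "http://example.test:801/b.html": '+111111111111 +222222222222 <a href="/">Home</a>',
}

found = crawl("http://example.test:801/", FAKE_SITE.__getitem__)
print(len(found), len(set(found)))
```

One caveat: if a site serves the same page under two URLs (for example `/` and `/index.html`), this sketch would count that page twice, so checking the result against a saved local copy is still a useful cross-check.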
