Forum Discussion
Help with Introduction to Python Scripting: Ep.7 – Demonstrate Your Skills
Hello all,
I am stuck on the last question of this Immersive lab. Here is the question:
Using Python, build a web scraper to scrape the website for 12-digit phone numbers beginning with + (e.g., +123456789012). The requests and BeautifulSoup4 (BS4) libraries are available to you. How many extracted phone numbers are returned?
I wrote the following Python script:
import requests
from bs4 import BeautifulSoup
import re
url = "http://10.102.35.108:4321" 
try:
    response = requests.get(url)
    response.raise_for_status()  # Raise an exception for bad status codes
except requests.exceptions.RequestException as e:
    print(f"Error fetching the page: {e}")
    exit()
soup = BeautifulSoup(response.text, 'html.parser')
phone_pattern = r"\+\d{12}" 
found_numbers = re.findall(phone_pattern, soup.get_text()) 
num_found = len(found_numbers)
print(f"Found {num_found} phone numbers:")
for number in found_numbers:
    print(number) 
The script returns 0, but that answer is marked as incorrect. Please help.
- Perfect code. However, the answer field expects you to count duplicates as well. I didn't notice this at first, because I just ran wget, then grep "+", and counted manually.
 Generally, it's best to get a local copy first (one that can be analyzed manually), and then implement the automated analysis. If something goes wrong, you can always take a look at the local copy.
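To illustrate the duplicate point: `re.findall` returns every match, so taking `len()` of its result counts repeats, while putting the matches into a `set` collapses them. Below is a minimal sketch on made-up sample text. The numbers are invented for illustration and are not taken from the lab, and the `(?!\d)` lookahead is an added assumption that rejects longer digit runs.

```python
import re

# Hypothetical sample text; these numbers are invented for illustration.
sample = "Call +123456789012 or +210987654321. Fax: +123456789012."

# A "+" followed by exactly 12 digits; (?!\d) rejects longer digit runs.
pattern = re.compile(r"\+\d{12}(?!\d)")

matches = pattern.findall(sample)  # a list, so duplicates are kept
total = len(matches)               # counts every occurrence
unique = len(set(matches))         # collapses repeated numbers

print(total, unique)  # prints "3 2"
```

If the answer field wants every occurrence, `total` is the figure to submit, not `unique`.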
6 Replies
- DCadet (Bronze II): The URL I used was the following:
 url = "http://10.102.35.108:801"
- netcat (Silver III): You're supposed to download the web page recursively, i.e. all the other pages as well, e.g. chariry.html etc.
 On the main page there are no phone numbers, that's right. On the other pages there are a few.
- DCadet (Bronze II): Thank you for your insight. I got 14, but it's still wrong.
import requests
from bs4 import BeautifulSoup
import re
from urllib.parse import urljoin

visited = set()
phone_numbers = set()

def scrape(url):
    if url in visited:
        return
    visited.add(url)
    print(f"Scraping {url}")
    try:
        r = requests.get(url)
        soup = BeautifulSoup(r.text, 'html.parser')
        # Find 12-digit numbers starting with +
        matches = re.findall(r'\+\d{12}', r.text)
        phone_numbers.update(matches)
        # Recurse into internal links
        for link in soup.find_all('a', href=True):
            full_url = urljoin(url, link['href'])
            if full_url.startswith(url):  # ensure we stay within the same site
                scrape(full_url)
    except Exception as e:
        print(f"Error scraping {url}: {e}")

start_url = f"http://10.102.50.198:{801}/"
scrape(start_url)

print("Extracted phone numbers:")
for number in phone_numbers:
    print(number)
print(f"Total unique phone numbers found: {len(phone_numbers)}")
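One way to combine the advice in this thread (follow links to the other pages, and count duplicates too) is to collect matches in a list rather than a set, and to compare each link against the start URL rather than the page currently being scraped. The following is only a sketch under assumptions: `crawl`, `LinkCollector`, and the `fetch` parameter are illustrative names, the small site in `pages` is invented so the example runs offline, link extraction uses the standard-library `HTMLParser` in place of BS4, and the `(?!\d)` lookahead is an added guard against longer digit runs.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin

# "+" followed by exactly 12 digits (the lookahead is an added assumption).
PHONE_RE = re.compile(r"\+\d{12}(?!\d)")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(start_url, fetch):
    """Visit each same-site page once; return all matches, duplicates included.

    `fetch` is a hypothetical callable that takes a URL and returns HTML text.
    """
    visited = set()
    numbers = []  # a list, so repeated numbers are kept
    queue = [start_url]
    while queue:
        url = queue.pop()
        if url in visited:
            continue
        visited.add(url)
        html = fetch(url)
        numbers.extend(PHONE_RE.findall(html))
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            link = urldefrag(urljoin(url, href)).url  # drop any #fragment
            if link.startswith(start_url):  # compare against the site root
                queue.append(link)
    return numbers


# Offline demo: an invented site held in a dict instead of real HTTP requests.
pages = {
    "http://example.test/": '<a href="a.html">A</a> <a href="b.html">B</a>',
    "http://example.test/a.html": 'Tel +123456789012 <a href="/">home</a>',
    "http://example.test/b.html": "Tel +123456789012, +210987654321",
}
found = crawl("http://example.test/", pages.__getitem__)
print(len(found), len(set(found)))  # prints "3 2"
```

Against the lab, `fetch` could be something like `lambda u: requests.get(u, timeout=10).text`. Note that this searches the raw HTML, as the script above does, so numbers inside attributes are counted too; whether the result matches the expected answer depends on the lab's pages.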