What are we doing this week?

This week we are going to look at how to do the FILE ANALYSIS part of GOOGLE SEARCH in PYTHON. We'll cover CHARACTER COUNT, WORD COUNT, and LINE COUNT, then see how to measure WORD FREQUENCY and put the pieces together into a full FILE ANALYSIS program. We'll also look at how to open URLs and join URLs in PYTHON, and at the code for a WEB CRAWLER.
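
As a rough preview of the file analysis topics, here is a minimal sketch that counts characters, words, and lines and builds a word frequency table. It is not the course's sample code: the function name `analyze_text` and the sample string are made up for illustration, and it works on a string rather than a file so that it runs anywhere.

```python
from collections import Counter
import string


def analyze_text(text):
    """Return character, word, and line counts plus word frequencies.

    Hypothetical helper for illustration only.
    """
    lines = text.splitlines()
    # Lowercase each word and strip surrounding punctuation so that
    # "The" and "the," are counted as the same word.
    words = [w.strip(string.punctuation).lower() for w in text.split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    return {
        "characters": len(text),
        "words": len(words),
        "lines": len(lines),
        "frequency": Counter(words),
    }


sample = "the cat sat\nthe dog sat down\n"
stats = analyze_text(sample)
print(stats["characters"], stats["words"], stats["lines"])
print(stats["frequency"].most_common(2))
```

To analyze a real file, the same function could be called on the result of `open(filename).read()`, where `filename` is whatever file you want to inspect.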
Powerpoint: File Analysis

Powerpoint: Web Crawling

Powerpoint: More on Content-Type

Total running time of videos is 55 minutes.

Matt Cutts: How Search Works

Eli Pariser: Beware online "filter bubbles"

Python Page Spider Web Crawler Tutorial

Scrape Websites with Python + Beautiful Soup 4 + Requests

Think Python: Word Frequency Analysis

Learn Python the Hard Way: Dictionaries, Oh Lovely Dictionaries

Python Docs: Brief Tour of the Standard Library

Scrapy - A Fast and Powerful Scraping and Web Crawling Program

HTML Scraping - The Hitchhiker's Guide to Python

Online Interpreter:
Python Interpreter
Sample Code:
String Pre-Processing * File Statistics * Word Frequency * Full File Analysis
Sample Files:

Replacing a word in a file:
File Word Replace * Input_file.txt
Filtering a String:
Sample String Filtering

URL Open * URL Join * Web Crawler
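
The core step of a crawler is turning the links found on a page into absolute URLs. The sketch below shows one way to do that with the standard library's `urllib.parse.urljoin` and `html.parser.HTMLParser`. The class name `LinkCollector`, the function `extract_links`, and the example.com addresses are assumptions made for illustration, not the course's sample code. It parses a hard-coded HTML string; a real crawler would first fetch the page, for example with `urllib.request.urlopen`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag as an absolute URL (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin resolves relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


page = '<a href="week2.html">Next</a> <a href="/labs/lab2.html">Lab</a>'
links = extract_links(page, "http://example.com/course/week1.html")
for link in links:
    print(link)
```

A crawler would repeat this for each newly found URL while keeping a set of pages already visited, so that it does not fetch the same page twice.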

More on the HTML Content-Type:

Powerpoint: More on Content-Type

Content Type Checker * HTMLChecker * Improved URL Open * Improved Web Crawler
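
A crawler usually wants to parse only HTML pages, so it checks the Content-Type header before reading a response. Here is a minimal sketch of such a check, assuming the header value is already available as a string. The function name `is_html` is hypothetical and this is not the course's Content Type Checker.

```python
def is_html(content_type):
    """Return True if a Content-Type header value names an HTML document."""
    if not content_type:
        return False  # header missing or empty
    # The header may carry parameters, e.g. "text/html; charset=UTF-8";
    # keep only the media type before the first semicolon.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == "text/html"


print(is_html("text/html; charset=UTF-8"))
print(is_html("image/png"))
```

With `urllib.request.urlopen`, the header value can be read from the response with `response.headers.get("Content-Type")`.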

Lab #2
Lab #2 is about adding options to the FULL FILE ANALYSIS program, and
about adding options to the WEB CRAWLER program.


