Word Counting, Parsing, and Practical Qualitative Analysis: Python and Google Apps Script


Imagine you’re running a business that receives 1,000 customer reviews, 2,000 product feedback emails, and 500 social media comments every single week. The volume is overwhelming. Manually reading through them is impossible, but missing key trends could mean losing customers, market share, or reputation.

At Brilliant Supply Chain, we believe that understanding unstructured text data quickly is a modern business advantage. One of the simplest, most practical first steps toward that goal is word counting and parsing.

In this article, we’ll explore how businesses can apply basic word frequency analysis using Python and Google Apps Script to make sense of text data — fast, efficiently, and practically.

What is Word Counting and Why Is It Useful?


When faced with massive volumes of customer feedback, social media chatter, or product reviews, reading every word line-by-line is inefficient.

Word frequency analysis solves this problem by:

  • Splitting the text into individual words.
  • Counting how often each word occurs.
  • Ignoring common “stop words” like the, is, and.
  • Highlighting meaningful words and dominant patterns.

Why businesses should care:

  • Quickly detect recurring problems (“delivery,” “refunds,” “quality”).
  • Identify emerging trends without human bias.
  • Prioritize resources based on real customer conversations.

This simple technique forms the foundation for deeper qualitative analysis, marketing intelligence, and customer satisfaction programs.
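
To make those steps concrete, here is a minimal sketch of the whole idea applied to a single sentence. The sample text and the three-word stop list are illustrative only; the full scripts later in this article are more complete.

Python

from collections import Counter
import re

text = "The delivery was late and the delivery box was damaged"

# Split into lowercase words, keeping letters and spaces only
words = re.sub(r'[^a-zA-Z\s]', '', text).lower().split()

# Drop a few common stop words
stop_words = {'the', 'was', 'and'}
words = [w for w in words if w not in stop_words]

# Count how often each remaining word occurs
print(Counter(words).most_common())
# [('delivery', 2), ('late', 1), ('box', 1), ('damaged', 1)]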

Practical Applications of Word Counting

  • Retail Business: Analyze 10,000+ reviews to prioritize product improvements.
  • Marketing Agency: Monitor brand sentiment across social media mentions.
  • Education Sector: Summarize student feedback from course evaluations.
  • Manufacturing Firm: Detect recurring machinery issues from maintenance reports.
  • E-Commerce: Identify common complaints in customer support tickets.

These use cases show that whether you’re in marketing, operations, or customer service, word counting gives you the first snapshot into what matters most to your audience.

How to Implement Word Counting

Word Counting with Python

Python is a lightweight, powerful language that is ideal for text processing. The standard library already includes what we need: the re module for cleaning text and collections.Counter for counting word frequencies.


Python

import re
from collections import Counter

# Step 1: Read the document
def read_document(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()

# Step 2: Clean and split text into words
def tokenize(text):
    # Remove special characters and numbers, keep only words
    text = re.sub(r'[^a-zA-Z\s]', '', text)
    words = text.lower().split()
    return words

# Step 3: Remove common stop words (optional; skip this step if you only need raw counts)
def remove_stopwords(words):
    stop_words = set([
        'the', 'is', 'and', 'a', 'to', 'in', 'of', 'for', 'on', 'with', 'as',
        'at', 'this', 'that', 'an', 'it', 'be', 'by', 'are', 'from', 'or',
        'was', 'but', 'not', 'have', 'has', 'had'
    ])
    filtered_words = [word for word in words if word not in stop_words]
    return filtered_words

# Step 4: Count word frequencies
def count_words(words):
    return Counter(words)

# Step 5: Main function to run everything
def main():
    file_path = 'sample_document.txt'  # Change this to your document's path
    text = read_document(file_path)
    words = tokenize(text)
    words = remove_stopwords(words)
    word_counts = count_words(words)

    # Display the most common words
    for word, count in word_counts.most_common():
        print(f"{word}: {count}")

# Run the analysis when this script is executed directly
if __name__ == '__main__':
    main()

💡 Example: Imagine parsing 5,000 customer reviews overnight with this script to find that “refund,” “late delivery,” and “packaging” were the top issues.
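
If those 5,000 reviews arrive as separate .txt files rather than one document, a small variation of read_document can combine them before tokenizing. The read_review_folder helper and the reviews folder name below are just assumed examples, not part of the script above.

Python

import glob

# Read and concatenate every .txt file in a folder of review exports
def read_review_folder(folder_path):
    texts = []
    for path in glob.glob(f'{folder_path}/*.txt'):
        with open(path, 'r', encoding='utf-8') as file:
            texts.append(file.read())
    return ' '.join(texts)

# Example: text = read_review_folder('reviews')  # then tokenize and count as above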

Word Counting with Google Apps Script

If your team uses Google Workspace, Apps Script provides a simple, serverless way to automate word counting inside Google Docs.

JavaScript (Google Apps Script)

function wordFrequencyCounter() {
  // Step 1: Read the document (replace with your own Google Doc ID)
  var fileId = 'YOUR_GOOGLE_DOC_FILE_ID';  // <--- Change this
  var doc = DocumentApp.openById(fileId);
  var text = doc.getBody().getText();

  // Step 2: Clean and split text into words
  var words = tokenize(text);

  // Step 3: Remove common stop words
  words = removeStopWords(words);

  // Step 4: Count word frequencies
  var wordCounts = countWords(words);

  // Step 5: Display the word frequencies in the Logger
  for (var word in wordCounts) {
    Logger.log(word + ": " + wordCounts[word]);
  }
}

// Function to clean text and split into words
function tokenize(text) {
  text = text.replace(/[^a-zA-Z\s]/g, '');  // Remove special characters and numbers
  text = text.toLowerCase();                // Convert to lowercase
  var words = text.split(/\s+/);             // Split by spaces
  return words;
}

// Function to remove stop words
function removeStopWords(words) {
  var stopWords = new Set([
    'the', 'is', 'and', 'a', 'to', 'in', 'of', 'for', 'on', 'with', 'as',
    'at', 'this', 'that', 'an', 'it', 'be', 'by', 'are', 'from', 'or',
    'was', 'but', 'not', 'have', 'has', 'had'
  ]);
  var filteredWords = [];
  for (var i = 0; i < words.length; i++) {
    if (!stopWords.has(words[i]) && words[i] !== '') {
      filteredWords.push(words[i]);
    }
  }
  return filteredWords;
}

// Function to count word frequencies
function countWords(words) {
  var wordCounts = {};
  for (var i = 0; i < words.length; i++) {
    var word = words[i];
    if (wordCounts[word]) {
      wordCounts[word]++;
    } else {
      wordCounts[word] = 1;
    }
  }
  return wordCounts;
}

💡 Example: Without installing any libraries, your marketing team can summarize 1,000 open-ended survey responses directly from Google Docs.

Word Counting with SQL + Python

For businesses storing feedback in databases or ERP systems, combining SQL to fetch data and Python to analyze it creates a powerful, scalable solution.

Python + SQL

import sqlite3
import re
from collections import Counter

# Step 1: Connect to the database (creates one if it doesn't exist)
conn = sqlite3.connect('sample_comments.db')
cursor = conn.cursor()

# Step 2: Create a table and insert some example comments
cursor.execute('''
CREATE TABLE IF NOT EXISTS comments (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    comment_text TEXT
)
''')

# Insert some sample data (you can skip this part if your table already has data)
sample_comments = [
    ("The product is great and very useful",),
    ("This is an amazing product but quite expensive",),
    ("Useful and affordable",),
    ("Product is not useful for professional use",)
]
cursor.executemany('INSERT INTO comments (comment_text) VALUES (?)', sample_comments)
conn.commit()

# Step 3: Fetch all comments from the database
cursor.execute('SELECT comment_text FROM comments')
rows = cursor.fetchall()

# Step 4: Combine all comments into a single text blob
text = ' '.join(row[0] for row in rows)

# Step 5: Tokenize - clean and split into words
def tokenize(text):
    text = re.sub(r'[^a-zA-Z\s]', '', text)  # Remove punctuation
    words = text.lower().split()
    return words

# Step 6: Remove stop words
def remove_stopwords(words):
    stop_words = set([
        'the', 'is', 'and', 'a', 'to', 'in', 'of', 'for', 'on', 'with', 'as',
        'at', 'this', 'that', 'an', 'it', 'be', 'by', 'are', 'from', 'or',
        'was', 'but', 'not', 'have', 'has', 'had'
    ])
    return [word for word in words if word not in stop_words]

# Step 7: Count word frequencies
words = tokenize(text)
filtered_words = remove_stopwords(words)
word_counts = Counter(filtered_words)

# Step 8: Print results
print("\nWord Frequency List:")
for word, count in word_counts.most_common():
    print(f"{word}: {count}")

# Close the database connection
conn.close()

💡 Example: Pull all customer support complaints from your database and instantly discover that “returns” and “damaged” are the most common words.
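
The same pattern extends to whatever table you already have. As a rough sketch, if your support tickets lived in a support_tickets table with a message column (the table name, column name, and support.db file are assumptions, not part of the script above), only the fetch step would change:

Python + SQL

import sqlite3

# Assumed schema: a support_tickets table with a 'message' text column
conn = sqlite3.connect('support.db')   # point this at your own database file
cursor = conn.cursor()
cursor.execute('SELECT message FROM support_tickets')
rows = cursor.fetchall()
conn.close()

text = ' '.join(row[0] for row in rows)
# ...then reuse tokenize(), remove_stopwords(), and Counter exactly as above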

Comparison of Tools

Each approach above fits a different environment. Plain Python is the quickest route for local files and ad hoc analysis; Google Apps Script needs no installation and suits teams already working in Google Workspace; and SQL + Python scales to feedback that already lives in databases or ERP systems. Pick the one that matches where your text data sits today.

Conclusion

Word counting isn’t the final solution for deep analytics, but it’s the critical first step. It allows businesses to:

  • Save time.
  • Surface hidden trends.
  • Set the stage for deeper insights (sentiment analysis, topic modeling, NLP).

At Brilliant Supply Chain, we don’t just share tools — we show you practical strategies for turning data into decisions.

Subscribe to Brilliant Supply Chain Newsletters to learn how to move beyond word counts into full-scale business intelligence, predictive analytics, and real-world supply chain transformations.

Your data is speaking. Are you listening?

 
