HTTP GET request in Python

Posted: 28th September 2014 by Tim in Python

There are a number of ways to make a GET request in Python, but the easiest (in my opinion) is via urllib2. With this library, you can make a request with only one line of code, storing the result for use later.

For example:

import urllib2
data = urllib2.urlopen("http://timmurphy.org").read()
print "Website size (bytes): " + str(len(data))

Will print:

Website size (bytes): 114209
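
Note that urllib2 only exists in Python 2. As a rough sketch of the same idea for Python 3, assuming the standard urllib.request module (the page_size name and the 10 second timeout are illustrative choices, not part of the original example):

```python
from urllib.request import urlopen


def page_size(url, timeout=10):
    """Fetch a URL with a GET request and return the body size in bytes."""
    # urlopen() issues a GET request when no data argument is supplied
    with urlopen(url, timeout=timeout) as response:
        return len(response.read())
```

Calling page_size("http://timmurphy.org") should then return the same kind of byte count as the Python 2 example, though the exact number depends on the page at the time of the request.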




HTML link coloring with CSS

Posted: 12th September 2014 by Tim in CSS, HTML

HTML link colors can be changed easily using CSS. The link states (pseudo-classes) which can be styled are:

  • link – a link to a page which has not been visited
  • visited – a visited link
  • hover – a link which has the mouse hovering over it
  • active – a link which is being clicked (mouse button held down)

Each of these pseudo-classes can be styled like any other CSS selector, so to change the color we can set the color property.

For example, the following CSS:

#coloredlink a:link { color: blue; }
#coloredlink a:visited { color: red; }
#coloredlink a:hover { color: black; }
#coloredlink a:active { color: green; }

with the following HTML:

<div id="coloredlink">
<a href="http://timmurphy.org/2010/02/28/my-first-latex-document/" target="_blank">LaTeX Tutorial</a>
</div>

will produce this link:

Passing function pointers as a parameter to another function can be tedious work. The function pointer definitions can be long and cumbersome to write, and obscure to read. Using pointers to member functions can be even more ambiguous. Fortunately, we can leverage the power of templates to make this work easier for us by making the compiler figure out the function pointer type for us. If we take this a step further, we can pass in both the object and its method as templated parameters, effectively allowing us to use the same code for multiple objects and multiple methods.

Consider the following code:

#include <iostream>

struct MyFuncs
{
    int smallInt() { return 1; }
    unsigned long bigULong() { return 1000; }
};

class OtherFuncs
{
public:
    unsigned long aNumber() { return 555; }
};

template <typename OBJECT, typename FUNC>
long long getNum(OBJECT obj, FUNC getNumber)
{
    return (obj.*getNumber)();
}

int main()
{
    MyFuncs funcs;
    std::cout << "small: " << getNum(funcs, &MyFuncs::smallInt) << std::endl;
    std::cout << "big: " << getNum(funcs, &MyFuncs::bigULong) << std::endl;

    OtherFuncs other;
    std::cout << "other: " << getNum(other, &OtherFuncs::aNumber) << std::endl;

    return 0;
}

The code above prints the following output:

small: 1
big: 1000
other: 555

This program creates a getNum(...) function which can take any object and that object’s (public) method, provided the method takes no arguments and returns a value which can be converted to a long long. If written without templates, we would need to write a separate getNum(...) function for each of the calls in main. Templates allow us to avoid such code duplication while keeping the code manageable.

Creating bar charts with gnuplot

Posted: 11th August 2014 by Tim in Gnuplot

Bar charts are very easy to create with gnuplot. Very little setup is required; just a data file with labels in one column and data in another. From here, the graph can be drawn with the following line:

plot <data_file> using <value_column>:xtic(<label_column>) with boxes

For example, the following two files:

barchart.gnuplot

set terminal pngcairo font "arial,10" size 500,500
set output 'barchart.png'
set boxwidth 0.75
set style fill solid
set title "Population of Australian cities (millions), as of June 2012"
plot "population.dat" using 2:xtic(1) with boxes

population.dat

Adelaide    1.277174
Brisbane    2.189878
Canberra    0.374658
Darwin      0.131678
Hobart      0.216959
Melbourne   4.246345
Sydney      4.667283

Will create this graph:

gnuplot bar chart

If you have a JAR file and want to print the details from MANIFEST.MF, this can be done with one command in Linux, using the unzip utility. For example:

$ unzip -p /usr/share/java/hsqldb.jar META-INF/MANIFEST.MF
Manifest-Version: 1.0
Created-By: 1.7.0_03-b147 (Oracle Corporation)
Specification-Title: HSQLDB
Implementation-Title: Standard runtime
Class-Path: /usr/share/java/servlet-api-3.0.jar
Main-Class: org.hsqldb.util.SqlTool
Ant-Version: Apache Ant 1.8.2
Implementation-Vendor: buildd
Implementation-Version: private-2012/07/12-02:29:31
Specification-Version: 1.8.0.10
Specification-Vendor: The HSQLDB Development Group
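
The unzip command works because a JAR file is a zip archive. As a minimal sketch of the same lookup from a script, assuming Python 3 and its standard zipfile module (the read_manifest name is hypothetical):

```python
import zipfile


def read_manifest(jar_path):
    """Return the text of META-INF/MANIFEST.MF from a JAR file."""
    # A JAR is a zip archive, so zipfile can open it directly
    with zipfile.ZipFile(jar_path) as jar:
        return jar.read("META-INF/MANIFEST.MF").decode("utf-8")
```

For example, print(read_manifest("/usr/share/java/hsqldb.jar")) should print the same text as the unzip command above, provided that file exists on your machine.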

PHP is able to communicate with PostgreSQL databases using some relatively simple calls. In a similar manner to other database systems, the script needs to do the following:

  1. connect to the database using pg_connect
  2. execute queries using pg_query, then release each result with pg_free_result
  3. close the database connection using pg_close

For example, consider the following script:

<?PHP
// database connection
$dbhost = "localhost";
$dbname = "everyone";
$dbuser = "phptest";
$dbpass = "testpassword";
$db = pg_connect("host=$dbhost dbname=$dbname user=$dbuser password=$dbpass")
    or die("Could not connect to database $dbname on host $dbhost!");

// execute the SQL query
$query = "SELECT lanname, lanpltrusted FROM pg_language;";
$result = pg_query($db, $query)
    or die ("Query failed: " . pg_last_error($db));

// print the results
echo "<table style=\"width: 500px; border: 1px black solid;\">\n";
echo "\t<tr>\n";
echo "\t\t<th>Language</th>\n";
echo "\t\t<th>Trusted</th>\n";
echo "\t</tr>\n";

while ($row = pg_fetch_array($result, NULL, PGSQL_ASSOC))
{
    echo "\t<tr>\n";
    echo "\t\t<td>{$row['lanname']}</td>\n";
    echo "\t\t<td>{$row['lanpltrusted']}</td>\n";
    echo "\t</tr>\n";
}

echo "</table>\n";

// clean up
pg_free_result($result);
pg_close($db);
?>

This script fetches all languages supported by this PostgreSQL installation, and notes whether each one is ‘trusted’ (ie: whether non-superusers can create scripts using that language). The script above will generate the following HTML table:

Language    Trusted
internal    f
c           f
sql         t
plpgsql     t

Note that this code does not only work for web environments; it can be used for standalone PHP scripts too.

LaTeX style (.sty) files

Posted: 27th June 2014 by Tim in LaTeX

When writing LaTeX documents, you may find yourself copying and pasting some common settings such as margins, fonts and paragraph indentation. This is not only tedious, it can be a real headache if you’re writing multiple documents that you want to look the same. To solve this problem, you can use a style (.sty) file.

A style file uses the same syntax as a LaTeX file, but uses the .sty suffix. To use this file in your LaTeX document, load it using the \usepackage{<filename_without_sty_suffix>} syntax.

Consider the following two files:

timstyle.sty

% timstyle.sty
% This file contains common document settings

% Page margins (2cm wider, 2cm longer)
\addtolength{\textwidth}{2cm}
\addtolength{\hoffset}{-1cm}
\addtolength{\textheight}{2cm}
\addtolength{\voffset}{-1cm}

% Font (Times New Roman)
\usepackage{times}

% No paragraph indentation
\setlength{\parindent}{0in}

example.tex

\documentclass[11pt, a4paper]{article}
\usepackage{timstyle} % note: no .sty suffix here
\begin{document}
Hello World! This is the first paragraph in the document.
The paragraph is not very long, but it spans multiple lines.
As you can see, the first line of the paragraph is not indented.\\

This is the second paragraph. Look --- still no indentation!
\end{document}

The code above will produce this document. As you can see, the margin, font and paragraph indentation settings are in timstyle.sty, which simplifies the example.tex file. The style file can now be reused in other files too.

Command Line Arguments in Bash

Posted: 14th June 2014 by Tim in Bash, Linux

In Bash, arguments passed in on the command line are stored in numbered variables. For example, the first argument is $1, the second argument is $2, and so on. The total number of arguments passed to the program is stored in $#.

$0 contains the path to the program. This path may be an absolute path or a relative path, depending on how you called the script. $@ and $* will return all of the arguments passed to the program.

For example:

#!/bin/bash
echo "Execution command: '$0 $@' ($# args)"
echo "First 3 arguments:"
if [ $# -ge 1 ]
then
    echo "  \$1 = $1"
fi
if [ $# -ge 2 ]
then
    echo "  \$2 = $2"
fi
if [ $# -ge 3 ]
then
    echo "  \$3 = $3"
fi

This script will print this if called using a relative path:
Execution command: './command_line_args.sh one two three' (3 args)
First 3 arguments:
  $1 = one
  $2 = two
  $3 = three

Or, if called using an absolute path:
Execution command: '/tmp/command_line_args.sh ichi ni san' (3 args)
First 3 arguments:
  $1 = ichi
  $2 = ni
  $3 = san

Using tools such as ps or top, you are able to see the processes running on a machine. However, you can’t see the directory in which each process is working (usually the directory from which it was started). Knowing the working directory can be useful if, for example, you need to move a script or program to stop a fork bomb, you want to see where a script or program lives, or a script or program reads files using a relative path and you want to see which files are being read.

This working directory can be found using the pwdx <pid> [<pid> ...] utility. For example, consider the following output from ps:

UID        PID  PPID  C STIME TTY          TIME CMD
everyone  2646     1  2 21:21 ?        00:00:00 terminal
everyone  2651  2646  0 21:21 pts/0    00:00:00 bash

We can see that terminal and bash are running, but we don’t know where these processes were started. Using pwdx, we can easily find out:

$ pwdx 2646 2651
2646: /home/everyone
2651: /home/everyone
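
On Linux, the same information is exposed through the /proc filesystem, where /proc/<pid>/cwd is a symbolic link to the working directory of the process. As a small sketch in Python 3, assuming a Linux machine with /proc mounted (the working_directory name is hypothetical):

```python
import os


def working_directory(pid):
    """Return the current working directory of the process with this PID."""
    # /proc/<pid>/cwd is a symlink to the working directory of the process.
    # Reading it for another user's process may require extra permissions.
    return os.readlink("/proc/{}/cwd".format(pid))
```

With the ps output above, working_directory(2646) would be expected to return /home/everyone.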

Consider the following query (tested on PostgreSQL – some other systems may require a table to be specified):

SELECT 'Yes' AS Value_Returned
WHERE 1 != 2;

This query returns 1 row: ( Value_Returned = 'Yes' ), as one would expect. But what if we compare against NULL?

SELECT 'Yes' AS Value_Returned
WHERE 1 != NULL;

0 rows returned, even though 1 is not NULL. This is because of the way logic works for NULLs; <anything> != NULL and <anything> = NULL always return UNKNOWN, which is not TRUE. UNKNOWN AND TRUE equals UNKNOWN, and UNKNOWN AND FALSE equals FALSE.

Similarly, any NOT IN operation using a set containing NULL will never return TRUE. For example:

SELECT 'Yes' as Value_Returned
WHERE 2 NOT IN (1, NULL);

does not return any rows.
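
The queries above can be tried without a database server. The sketch below uses Python 3 with the standard sqlite3 module and an in-memory SQLite database, which is an assumption on my part since the original queries were tested on PostgreSQL. The rows helper is hypothetical, and the IS NOT NULL query is an addition showing the usual way to test for NULL:

```python
import sqlite3


def rows(sql):
    """Run a query against a throwaway in-memory database and return its rows."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# 1 != 2 is TRUE, so one row comes back
not_equal = rows("SELECT 'Yes' AS Value_Returned WHERE 1 != 2")
# 1 != NULL is UNKNOWN, so the row is filtered out
null_compare = rows("SELECT 'Yes' AS Value_Returned WHERE 1 != NULL")
# IS NOT NULL is the NULL-safe way to write the intended test
is_not_null = rows("SELECT 'Yes' AS Value_Returned WHERE 1 IS NOT NULL")
# NOT IN with a NULL in the set can never be TRUE
not_in_null = rows("SELECT 'Yes' AS Value_Returned WHERE 2 NOT IN (1, NULL)")

print(not_equal, null_compare, is_not_null, not_in_null)
```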

In a C or C++ program, fork() can be used to create a new process, known as a child process. This child is initially a copy of the parent, but can be used to run a different branch of the program or even execute a completely different program. After forking, child and parent processes run in parallel. Any variables local to the parent process will have been copied for the child process, so updating a variable in one process will not affect the other.

Consider the following example program:

#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    printf("--beginning of program\n");

    int counter = 0;
    pid_t pid = fork();

    if (pid == 0)
    {
        // child process
        int i = 0;
        for (; i < 5; ++i)
        {
            printf("child process: counter=%d\n", ++counter);
        }
    }
    else if (pid > 0)
    {
        // parent process
        int j = 0;
        for (; j < 5; ++j)
        {
            printf("parent process: counter=%d\n", ++counter);
        }
    }
    else
    {
        // fork failed
        printf("fork() failed!\n");
        return 1;
    }

    printf("--end of program--\n");

    return 0;
}

This program declares a counter variable, set to zero, before fork()ing. After the fork call, we have two processes running in parallel, both incrementing their own version of counter. Each process will run to completion and exit. Because the processes run in parallel, we have no way of knowing which will finish first. Running this program will print something similar to what is shown below, though results may vary from one run to the next.

--beginning of program
parent process: counter=1
parent process: counter=2
parent process: counter=3
child process: counter=1
parent process: counter=4
child process: counter=2
parent process: counter=5
child process: counter=3
--end of program--
child process: counter=4
child process: counter=5
--end of program--
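
The same system call is available from other languages. As a sketch of the "each process has its own copy of counter" point, here is a Python 3 version using os.fork on a Unix-like system. The fork_demo name is hypothetical, and passing the child's counter back through its exit status is just a convenience for this example:

```python
import os


def fork_demo():
    """Fork once; return (parent counter, child counter)."""
    counter = 0
    pid = os.fork()

    if pid == 0:
        # child process: works on its own copy of counter
        for _ in range(5):
            counter += 1
        # report the child's counter through its exit status
        os._exit(counter)

    # parent process: wait for the child, then update its own copy once
    _, status = os.waitpid(pid, 0)
    child_counter = os.WEXITSTATUS(status)
    counter += 1
    return counter, child_counter


parent_counter, child_counter = fork_demo()
print("parent: counter=%d, child: counter=%d" % (parent_counter, child_counter))
```

Unlike the C program, the parent here waits for the child to finish, so the output order does not vary between runs.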

C/C++ gotcha – using #if true

Posted: 14th April 2014 by Tim in C, C++

Consider the following code, which compiles without warnings with both gcc and g++:

#include <stdio.h>

int main(int argc, char **argv)
{
#if true
    printf("This does what you expect\n");
#else
    printf("This does not do what you expect!\n");
#endif

    return 0;
}

When compiling with g++, the program prints This does what you expect. However, when compiling with gcc, it prints This does not do what you expect!

The problem here is with the #if true directive. In C++, true is a keyword which (unsurprisingly) evaluates to true. However, C (before C23) has no such keyword, so the preprocessor treats true as an undefined identifier. #if <undefined_identifier> always evaluates to false, which is why the #else block is compiled instead.

If you’re writing code which is used in both C and C++, use #if 0 or #if 1 instead as this is guaranteed to behave in the same way in both languages.

Adding lyrics to sheet music with Lilypond

Posted: 26th March 2014 by Tim in LilyPond

LilyPond is a useful tool for typesetting music. Previously, I explained the basics of how to create sheet music for Mary Had A Little Lamb. This post follows on from that one and explains how to add the lyrics, so read the previous post first if you haven’t done so already.

Adding lyrics to your music only requires two extra steps:

1) Write down the lyrics

Lyrics need to be written inside a \lyricmode block. A couple of things to note here:

  • By default, each word is associated with one note. To skip a note, use "".
  • If a word spans multiple notes, split the word on the note boundaries and add -- between them. See the example below.

The lyrics for Mary Had A Little Lamb would look like this:

words = \lyricmode {
    Ma -- ry had a
    lit -- tle lamb
    lit -- tle lamb
    lit -- tle lamb
    Ma -- ry had a
    lit -- tle lamb whose
    fleece was white as snow
}

In the above, words is the name given to this block of lyrics, which will be used later.

2) Add the lyrics to the score

This can be done with \addlyrics \words where \words is the name given above.

The full document would look like this:

song = \relative c' {
    \clef treble
    \key c \major
    \time 4/4

    e4 d c d e e e2 d4 d d2 e4 e e2
    e4 d c d e e e c d d e d c2 r2
}

words = \lyricmode {
    Ma -- ry had a
    lit -- tle lamb
    lit -- tle lamb
    lit -- tle lamb
    Ma -- ry had a
    lit -- tle lamb whose
    fleece was white as snow
}

\score {
  <<
    \new Staff \song
    \addlyrics \words
  >>
}

which will generate this:

Lilypond lyrics example - Mary Had A Little Lamb

When working with large documents with tens (or hundreds) of pages, it’s useful to be able to scroll directly to the section you’re interested in by clicking the section in the table of contents. In LaTeX, this functionality can be added quickly and easily in just a few lines using the hyperref package (and the color package if you want the links to be colored).

This post builds on an earlier post about how to add a table of contents to a LaTeX document. If you don’t know how to do that, read that post first.

To make the links clickable, we need to add the packages and configuration to the preamble – the part before \begin{document}. A typical configuration may look something like this:

\usepackage{color}
\usepackage{hyperref}
\hypersetup{
    colorlinks=true, % make the links colored
    linkcolor=blue, % color TOC links in blue
    urlcolor=red, % color URLs in red
    linktoc=all % 'all' will create links for everything in the TOC
}

The configuration is fairly self-descriptive. With this, we will have a table of contents with links, as well as clickable website URLs (always useful).

The full working example will produce this document:

\documentclass[12pt, a4paper]{article}

\usepackage{color}
\usepackage{hyperref}
\hypersetup{
    colorlinks=true,
    linkcolor=blue,
    urlcolor=red,
    linktoc=all
}

\begin{document}

\tableofcontents
\newpage

\section{First Section}
\subsection{First part of the first section}
Source code for this can be found at \url{http://timmurphy.org/2014/03/11/latex-table-of-contents-with-clickable-links}
\subsection{Second part of the first section}
\ldots

\section{Second Section}
\subsection{First part of the second section}
\ldots
\end{document}

Using environment variables in C++

Posted: 26th February 2014 by Tim in C++

Sometimes you need to use environment variables from within your program. There are a few ways to get the environment into your program, but the most portable way is to use the getenv function. The function will return a pointer to the null-terminated string value, or NULL if the variable is not set.

For example, the following program will print the value of $HOME if it is set. It will return 0 if the variable is set, or 1 if it’s not.

#include <cstdlib>
#include <iostream>

int main()
{
    char *homePath(getenv("HOME"));
    if (homePath == NULL)
    {
        std::cout << "$HOME is not set!" << std::endl;
    }
    else
    {
        std::cout << "$HOME is set to '" << homePath << "'" << std::endl;
    }

    return homePath == NULL;
}

There are many Linux tools available to do search and replace, with sed being one of the most commonly used. However, tools like sed work line-by-line. If you need to replace/remove newline characters then things get complicated. It can be done with sed, but it’s not pretty.

The nicest solution I’ve seen is using awk. awk uses a Record Separator (RS) setting to determine how to split each record, and an Output Record Separator (ORS) setting to determine how to split the records as they are output. By default, RS and ORS are both set to '\n' (newline), meaning it reads in text line-by-line and outputs them in the same form. By changing ORS to something else, we can get all of the data printed on one line.

The examples below will use a file named random_data.txt which contains the following data:

18838ef123e
f33a244eb1e
4492b3091o9
9o7ef44b22e
77a1194g229

To replace the newline characters with a space, we can use the following:

awk '{ print $0; }' RS='\n' ORS=' ' < random_data.txt
18838ef123e f33a244eb1e 4492b3091o9 9o7ef44b22e 77a1194g229

ORS does not have to be one character:

awk '{ print $0; }' RS='\n' ORS=' :: ' < random_data.txt
18838ef123e :: f33a244eb1e :: 4492b3091o9 :: 9o7ef44b22e :: 77a1194g229 ::

The above commands are deliberately verbose to make it obvious what's going on. However, both the RS value and the print $0 are default settings. RS can be omitted completely, and the print code can be replaced with the number 1. This 1 is a condition which is always true and, with no action given, awk falls back to its default behaviour of printing the record.

So to repeat the example of replacing newlines with a space, we can shorten the command to:
awk 1 ORS=' ' < random_data.txt
18838ef123e f33a244eb1e 4492b3091o9 9o7ef44b22e 77a1194g229

The shorter command is a bit more abstract but does the same job while cutting the command line length in half.
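
To make the RS/ORS mechanism more concrete, here is a rough model of it in Python 3. This is only a sketch of the behaviour described above, not of how awk is implemented, and the replace_record_separator name is hypothetical:

```python
def replace_record_separator(text, rs="\n", ors=" "):
    """Split text into records on rs, then write each record followed by ors."""
    records = text.split(rs)
    # a separator at the very end of the input does not start another record
    if records and records[-1] == "":
        records.pop()
    # like awk, every record (including the last) is followed by ors
    return "".join(record + ors for record in records)


data = "18838ef123e\nf33a244eb1e\n4492b3091o9\n9o7ef44b22e\n77a1194g229\n"
print(replace_record_separator(data, ors=" :: "))
```

As with the awk examples, the output ends with a trailing separator because ORS is written after every record, including the last one.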

One of the simplest ways to pretty-print code in LaTeX documents is to use the listings package. The package can be configured to use specific colors for different parts of the code, with many programming languages supported.

The following document will display code for both C++ and Java, with settings provided for the most common configuration:

\documentclass{article}
\usepackage{listings}
\usepackage{xcolor} % for setting colors

% set the default code style
\lstset{
    frame=tb, % draw a frame at the top and bottom of the code block
    tabsize=4, % tab space width
    showstringspaces=false, % don't mark spaces in strings
    numbers=left, % display line numbers on the left
    commentstyle=\color{green}, % comment color
    keywordstyle=\color{blue}, % keyword color
    stringstyle=\color{red} % string color
}

\begin{document}

\begin{lstlisting}[language=C++, caption={C++ code using listings}]
#include <iostream>
int main()
{
    // print hello to the console
    std::cout << "Hello, world!" << std::endl;
    return 0;
}
\end{lstlisting}

\begin{lstlisting}[language=Java, caption={Java code using listings}]
public class Hello
{
    public static void main(String[] args)
    {
        // print hello to the console
        System.out.println("Hello, world!");
    }
}
\end{lstlisting}

\end{document}

This will produce the following document:

LaTeX document with Java and C++ code syntax highlighting

The package is much more flexible than the example above shows; see the full documentation for more details.

C/C++ gotcha – ternary operator casting

Posted: 13th January 2014 by Tim in C, C++

C and C++ have what is known as a ternary operator; syntax which allows you to do conditional operations inline. This is done using syntax similar to:

const int b = (<condition> ? 10 : 100);

This will set b to 10 if <condition> is true, or 100 otherwise. Ternary operators allow code to be written more concisely, and allow you to do things like populating const variables conditionally as shown above, which cannot be done in a normal if (<condition>) {...} else {...} block. However, there is (at least) one thing to watch out for: casting.

Ternary operators return a value, and the data type must be the same for both results (ie: you cannot return an integer in one case and a string in another). If you attempt to return two compatible types of different sizes, such as a float and a double, the smaller value will be upcast (in this case, the float will be cast to a double). Why would that be a problem, you ask? Consider the following program:

#include <iostream>

int main()
{
    std::cout << (true ? 'a' : 100) << std::endl;
    return 0;
}

Running the program will print:

97

So what’s going on here? This simple program should print the letter ‘a’ if true == true (ie: always). However, the other return value is 100 which is an int, not a char. The ternary operator can only return one type, so the char ‘a’ (ASCII character 97) is upcast to an int, and the number 97 is printed instead.

Command line arguments with Perl

Posted: 27th December 2013 by Tim in Perl

Command line arguments in Perl are stored in the @ARGV array, and the number of arguments can be deduced from the last index of that array: $#ARGV + 1. One way to access the arguments is by index: $ARGV[0] for the first argument, for example. Unlike C, the program name is not passed as the first argument.

For example, the following Perl program will print all arguments passed to it:

print "There are " . ($#ARGV + 1) . " arguments:\n";
for (my $i = 0; $i <= $#ARGV; ++$i)
{
    print "  \$ARGV[" . $i . "] = " . $ARGV[$i] . "\n";
}

$ perl <script_name> first second third
There are 3 arguments:
  $ARGV[0] = first
  $ARGV[1] = second
  $ARGV[2] = third

The lattice multiplication method is a way of multiplying two numbers in a simple, concise form. It works a lot like the traditional method taught in schools but can be easier and faster for multiplying large numbers.

Let’s go through it step by step, using the example of 64 x 17:

1) Draw a box, with one number written horizontally along the top and the other number written vertically down the right-hand side.

Lattice multiplication - step 1

2) Split the box into smaller boxes so that each number on the top edge has its own column and each number on the side has its own row.

Lattice multiplication - step 2

3) Split each box diagonally from top-right to bottom-left.

Lattice multiplication - step 3

4) Multiply each row by each column and write the answer in the associated box. Tens values go to the left of the diagonal line, unit values go to the right.

Lattice multiplication - step 4.1

and so on until you have populated the entire box.

Lattice multiplication - step 4.2

5) Sum up each diagonal from right to left. Carry tens over to the diagonal on the left.

Lattice multiplication - step 5.1

Lattice multiplication - step 5.2

Lattice multiplication - step 5.3

Lattice multiplication - step 5.4

And you’re done! The answer is written along the edge of the box, from top-left down and across to bottom right.

Lattice multiplication - finished

Using this concise method, we’ve calculated that 64 x 17 = 1088. This can be used for numbers of any size; simply use a smaller or larger box as necessary.
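
The steps above can also be written as a short program. The following Python 3 sketch follows the same procedure for non-negative whole numbers: fill each box with a tens and a units digit, then sum the diagonals from right to left, carrying tens. The lattice_multiply name is hypothetical:

```python
def lattice_multiply(a, b):
    """Multiply two non-negative integers using the lattice method."""
    top = [int(d) for d in str(a)]   # digits written along the top
    side = [int(d) for d in str(b)]  # digits written down the right-hand side

    # diagonals[0] is the rightmost (units) diagonal
    diagonals = [0] * (len(top) + len(side))

    # step 4: multiply each row digit by each column digit
    for r, side_digit in enumerate(side):
        for c, top_digit in enumerate(top):
            tens, units = divmod(side_digit * top_digit, 10)
            # which diagonal the units half of this box falls on
            k = (len(top) - 1 - c) + (len(side) - 1 - r)
            diagonals[k] += units
            diagonals[k + 1] += tens  # the tens half is one diagonal to the left

    # step 5: sum each diagonal from right to left, carrying tens
    digits = []
    carry = 0
    for total in diagonals:
        carry, digit = divmod(total + carry, 10)
        digits.append(digit)

    # read the answer from the leftmost diagonal to the rightmost
    return int("".join(str(d) for d in reversed(digits)))


print(lattice_multiply(64, 17))
```

For 64 x 17 the diagonal sums, from right to left, are 8, 8, 10 and 0. Carrying the ten gives the digits 8, 8, 0 and 1, which read 1088 from left to right.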

Namespace Alias in C++

Posted: 26th November 2013 by Tim in C++

Sometimes you may find yourself working with namespaces which are really long to type. Writing out the whole namespace can be tedious and make the code harder to read, and using using namespace can sometimes make the code ambiguous (and is discouraged by some coding guidelines). The solution to this is to create a namespace alias, which allows you to reference the full namespace using a shorter name. You can create a namespace alias with the following syntax:

namespace <alias> = <full::namespace::reference>;

<alias> can now be used where you would usually use <full::namespace::reference>. For example:

#include <iostream>

namespace My {
namespace Really {
namespace Long {
namespace Name {
namespace Space {

void sayHello()
{
    std::cout << "Hello!" << std::endl;
}

} // namespace Space
} // namespace Name
} // namespace Long
} // namespace Really
} // namespace My

namespace MyNS = My::Really::Long::Name::Space;

int main()
{
    MyNS::sayHello();
    return 0;
}

Will print:
Hello!

FIFOs are a very simple tool for communicating between processes, and using them in Python is very easy. Simply call os.mkfifo(<path>) and treat the FIFO like any other file.

For example, we can create two simple python scripts, one for sending and one for receiving.

sender.py

import os

path = "/tmp/my_program.fifo"
os.mkfifo(path)

fifo = open(path, "w")
fifo.write("Message from the sender!\n")
fifo.close()

receiver.py

import os
import sys

path = "/tmp/my_program.fifo"
fifo = open(path, "r")
for line in fifo:
    print "Received: " + line,
fifo.close()

If the programs are run in separate terminal windows, the receiver will print the following:
Received: Message from the sender!
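
The scripts above use Python 2 print statements. As a sketch of the same exchange in Python 3, the example below runs the sender in a background thread so that everything fits in one script. That is an assumption made purely for illustration, since FIFOs are normally used between separate processes. The send, receive and demo names and the temporary directory are hypothetical:

```python
import os
import tempfile
import threading


def send(path, message):
    # opening a FIFO for writing blocks until a reader opens it
    with open(path, "w") as fifo:
        fifo.write(message)


def receive(path):
    # iteration ends once the writer closes its end of the FIFO
    with open(path, "r") as fifo:
        return list(fifo)


def demo():
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "my_program.fifo")
    os.mkfifo(path)
    try:
        sender = threading.Thread(
            target=send, args=(path, "Message from the sender!\n"))
        sender.start()
        lines = receive(path)
        sender.join()
    finally:
        # remove the FIFO and its directory when finished
        os.remove(path)
        os.rmdir(directory)
    return lines


for line in demo():
    print("Received: " + line, end="")
```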

Conditional INSERT in SQL

Posted: 23rd October 2013 by Tim in SQL

Sometimes you want to run an INSERT statement in SQL only if some condition is met. There are a few methods available to do this, but not all of them are supported by all database systems. One method which is widely supported is the use of a SELECT statement to return the row values, with the condition set in that SELECT statement (some systems may require a table to be specified in the SELECT):

INSERT INTO <table> (<col1>, <col2>)
SELECT <val1>, <val2>
WHERE <condition>

One practical example of this is inserting a row only if it does not already exist. This can be done like so:

Table definition:
CREATE TABLE person (person_id int PRIMARY KEY, name varchar(20));

Row Insert:
INSERT INTO person (person_id, name)
SELECT 1, 'Me'
WHERE NOT EXISTS (SELECT 1 FROM person WHERE person_id = 1);

Running the row insert query for the first time will result in the row being inserted. If run a second time, no row is inserted because a row with person_id = 1 already exists.
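
This behaviour can be checked quickly from a script. The sketch below uses Python 3 with the standard sqlite3 module and an in-memory SQLite database, which is an assumption since the post does not say which database system was used. The insert_person_if_missing name is hypothetical:

```python
import sqlite3

INSERT_IF_MISSING = """
INSERT INTO person (person_id, name)
SELECT ?, ?
WHERE NOT EXISTS (SELECT 1 FROM person WHERE person_id = ?)
"""


def insert_person_if_missing(conn, person_id, name):
    """Insert the row only if no row with this person_id exists yet."""
    conn.execute(INSERT_IF_MISSING, (person_id, name, person_id))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (person_id int PRIMARY KEY, name varchar(20))")

# the first call inserts the row; the second finds it and inserts nothing
insert_person_if_missing(conn, 1, "Me")
insert_person_if_missing(conn, 1, "Me")

count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print("rows in person:", count)
```

The second call does not raise a primary key error because its SELECT returns no rows, so there is nothing to insert.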

Counting the number of occurrences of a given character in a std::string can be done using one function from the C++ standard library: the std::count(...) function from the <algorithm> header. This function takes three parameters: two iterators (the beginning and end of the desired search), and the item you wish to count. This function can be used for any STL container which uses iterators, such as vector and set.

For example, this full program:

#include <algorithm>
#include <iostream>
#include <string>

int main()
{
    std::string myString("The bubble sort algorithm is a good algorithm "
                         "for students to learn, but is usually too slow "
                         "for real world applications");
    std::cout << "The string '" << myString << "' has "
              << std::count(myString.begin(), myString.end(), 'a')
              << " occurrences of the letter 'a'" << std::endl;

    return 0;
}

Will print the following:

The string 'The bubble sort algorithm is a good algorithm for students to learn, but is usually too slow for real world applications' has 8 occurrences of the letter 'a'

Perl regular expressions are slightly different from grep (or egrep) regular expressions. grep is sufficient most of the time, but sometimes you may need the extra flexibility of Perl regular expressions, or may just want to test out a regular expression that you will use later in Perl code. This can be done on the command line like so:

perl -ne 'print if /my.regular.expression/' < file

For example, let’s take a file full of random data, named random_data.txt which contains the following:

18838ef123e
f33a244eb1e
4492b3091o9
9o7ef44b22e
77a1194g229

We can write a regular expression to print out the lines which are valid hexadecimal values. The following Perl regular expression can be used for this:

^[\da-fA-F]+$

Using the syntax above, we can print the lines in the file which are valid hexadecimal numbers like so:

perl -ne 'print if /^[\da-fA-F]+$/' < random_data.txt

Which will print:

18838ef123e
f33a244eb1e

Similarly, text can be piped to the perl command using cat like so:

cat random_data.txt | perl -ne 'print if /^[\da-fA-F]+$/'
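
For comparison, the same regular expression can be tried in Python 3 with the standard re module. This is only a sketch; the hex_lines name is hypothetical, and the re.ASCII flag is an added assumption so that \d only matches the digits 0-9:

```python
import re

# same pattern as the Perl example; re.ASCII restricts \d to 0-9
HEX_LINE = re.compile(r"^[\da-fA-F]+$", re.ASCII)


def hex_lines(lines):
    """Return only the lines which are valid hexadecimal values."""
    return [line for line in lines if HEX_LINE.search(line)]


data = ["18838ef123e", "f33a244eb1e", "4492b3091o9",
        "9o7ef44b22e", "77a1194g229"]
for line in hex_lines(data):
    print(line)
```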