Fosbenner.net

Transitioning from Static Webpages

Posted: 1-25-19 Updated:

If you are like me and do not consider yourself a web developer, perhaps you like the idea of having a program that generates HTML for you. To create a new webpage, you would define the content and let the program do all of the markup. This is the idea that motivated me while developing this website: if I spent the time writing a program to generate the HTML, I would not have to write a new HTML document every time I wanted to post a new page.

Of course, this is not a new idea. Many websites are dynamic, much more so than mine. There are a number of ways to do this, with different programming languages, web servers, and configurations. In this article, I will describe one way of doing it: using Python to generate the HTML and Apache as the web server, running on Linux. The scripts will be run using the Common Gateway Interface, or CGI. My examples are written for Python 2.7, so they would require some tweaks to be compatible with Python 3 (for example, print becomes a function and xrange becomes range).

Let me say, right off the bat, that CGI is an old standard, and is not the best way to run server-side scripts. The primary reason is that each time a CGI script is run, the web server forks a new process, which uses more system resources than running the script in a process that already exists. Still, as someone new to developing dynamic webpages, this seemed like a good place to start, in order to learn the basics. (I expect to write another article in the future about how to use a more modern technology, WSGI.)

Also, in the following two sections, my intent is not to write a foolproof method to configure Apache and CGI on any server, anywhere. I have done this on CentOS 7, and I am sure that default configurations differ between Linux distributions. I recommend reading the Apache documentation for more details on configuring the httpd server. If you do not have a web server, but do have Python installed, you can run these examples directly in a terminal, and they will output the HTML that would otherwise be sent to a browser.

Getting Apache httpd running

Please note, for this whole section, you will have to run the commands as root. First, you will need to have the Apache httpd server installed on your server. On CentOS 7, you would install this with the following command:


yum install httpd
					

To start the webserver, and ensure that it starts automatically when the system boots, run:


systemctl start httpd.service
systemctl enable httpd.service
					

Now, open the HTTP and HTTPS ports on the firewall (note: you should really implement HTTPS as well as HTTP, but I am not going to cover that in this article):


firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-service=https
firewall-cmd --reload
					

To ensure that everything is working, open your browser and go to the name or IP address of your server. You should see a friendly Apache webpage telling you that everything is working.

Setting up CGI

Let's make sure that the CGI module is being loaded when Apache starts up. Run this command:


httpd -M
					

You should see a line that says either cgi_module or cgid_module. On CentOS 7, this module is loaded by default. If you see the module listed, you can skip ahead. If you do not, you will have to edit the Apache configuration to tell it to load the module at startup. The MPM module that is loaded determines which CGI module you should use. Looking again at the output from httpd -M, if you see mpm_prefork_module, you should use cgi_module. If you see either mpm_event_module or mpm_worker_module, you should use cgid_module. You can add the appropriate line to your httpd configuration using one of the following commands (as root):


echo "LoadModule cgi_module modules/mod_cgi.so" >> /etc/httpd/conf/httpd.conf
-OR-
echo "LoadModule cgid_module modules/mod_cgid.so" >> /etc/httpd/conf/httpd.conf
					

Then, restart the httpd server by running:


systemctl restart httpd.service
					

Now that we know the CGI module is loaded, let's look for the default cgi-bin directory. Open /etc/httpd/conf/httpd.conf and look for a ScriptAlias directive. On CentOS 7, the default is:


ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"
					

This means that the /cgi-bin/ directory will be mapped to /var/www/cgi-bin/. Notice that this is outside of the default web root, which is /var/www/html. This is to help ensure that the source code of your CGI scripts cannot be downloaded. For example, if you browsed to www.foo.com/cgi-bin/bar, the server would actually be running /var/www/cgi-bin/bar.
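To make the mapping concrete, here is a minimal sketch of the prefix substitution that ScriptAlias describes. The function name resolve_script_alias is hypothetical and only models the idea; Apache does this internally and handles many more cases. The default arguments are the CentOS 7 values shown above.

```python
# Hypothetical helper: models how a ScriptAlias prefix maps a URL path
# to a file on disk. This is an illustration, not Apache's actual code.
def resolve_script_alias(url_path,
                         alias='/cgi-bin/',
                         target='/var/www/cgi-bin/'):
    """Return the filesystem path for url_path, or None if no match."""
    if not url_path.startswith(alias):
        return None  # not under the alias, so this mapping does not apply
    # Swap the URL prefix for the directory on disk
    return target + url_path[len(alias):]

print(resolve_script_alias('/cgi-bin/bar'))   # /var/www/cgi-bin/bar
print(resolve_script_alias('/index.html'))    # None
```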

To run Python scripts using CGI, you will have to add the following to your httpd configuration:


AddHandler cgi-script .py
					

Please note, if you are using Virtual Hosts in your Apache configuration, it would be better to place the ScriptAlias and AddHandler directives inside the <VirtualHost>, as opposed to in the global httpd config, but that is outside the scope of this article.

Writing a simple CGI script

Essentially, when you browse to a CGI page, Apache runs the CGI script and sends the output of the program to the browser. For our purposes, we want to output HTML, but it could be something else, like an XML document or plain text. The print statement in Python would normally print something to the console, but in this case, we can print HTML that in turn will be interpreted by the browser and displayed as a webpage. Consider the following example:


#!/usr/bin/python
print 'Content-Type: text/html\n'
print '<!DOCTYPE html>'
print '<html>'
print '<body>'
print '<h1>This is a header</h1>'
print '<p>This is a paragraph</p>'
print '<p>This is another paragraph</p>'
print '</body>'
print '</html>'
					

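Note that the first print statement outputs the HTTP Content-Type header, which Apache needs. The trailing \n, together with the newline that print adds on its own, produces the blank line that separates the headers from the body. This header will not actually be part of the HTML.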
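The split between the header and the body can be sketched as a small function. The name build_response is my own invention for illustration, and the code is written to run on both Python 2.7 and Python 3. It only builds the text that a CGI script would write to standard output.

```python
def build_response(body, content_type='text/html'):
    """Return the full text a CGI script would write to standard output."""
    header = 'Content-Type: ' + content_type
    # The empty line between header and body marks the end of the headers.
    return header + '\n\n' + body

response = build_response('<p>Hello</p>')
print(response)
```

Passing a different content_type, such as 'text/plain', is how the same script could send plain text instead of HTML.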

This example shows a very simple way to generate HTML with Python, but it's not very practical. It is essentially just a static HTML file being spit out by a script, which doesn't save the developer any time.

A Slightly More Interesting Example

Since HTML syntax follows only a few basic rules, we can write functions to do the markup for us. You have elements, which are made up of a start tag, some content, and an end tag. You also have empty elements, which do not have content or an end tag. By writing functions to construct the markup, we can begin our departure from manually writing HTML documents. Here, instead of printing the HTML straightaway, I will store each line in a list of strings that I can print later.

#!/usr/bin/env python

## FUNCTION DEFINITIONS ##
def st(tag, attr=""):
   """Generate HTML start tag"""
   if attr != "": #pad attr with a space
      attr = " " + attr
   return '<' + tag + attr + '>'

def et(tag):
   """Generate HTML end tag"""
   return '</' + tag + '>'

def elem(tag, content, attr=""):
   """Generate whole element"""
   return st(tag, attr) + content + et(tag)

def eelem(tag, attr=""):
   """Generate empty element"""
   attr += ' /' # add space, slash to end
   return st(tag, attr)

## START EXECUTION ##
title = 'Example Site'
desc = 'This page is generated by Python!'

out = []
out.append('Content-Type: text/html\n')
out.append('<!DOCTYPE html>\n' + st('html') + '\n' +
   elem('head', '\n\t' +
   elem('title', title) + '\n\t' +
   eelem('meta', 'name="description" content="' + desc + '"') + '\n\t' +
   eelem('link', 'rel="stylesheet" href="https://fosbenner.net/s/playground.css"') + '\n'))

out.append(elem('body', '\n\t' +
   elem('h1', 'This is a header') + '\n\t' +
   elem('p', 'This is a paragraph') + '\n\t' +
   elem('p', 'This is another paragraph') + '\n'
   ) + '\n' + et('html'))

for i in out:
   print i

				

Notice how I was able to nest elements within other elements. Also, to clarify, the newlines and tabs are just there to make the output prettier; they are not necessary. Lastly, I used variables to store the title and description, and string literals to store the content of the page. This was just to show two different ways to accomplish the same thing. You could also use variables inside any of the functions, which would allow you to iterate through a long list and create an element for each item, as I will demonstrate next.
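To see the nesting in isolation, here is a minimal sketch that reuses the st, et and elem helpers from the script above. The 'class="note"' attribute and the text are made up for the example. I wrote print with parentheses so that it runs on both Python 2.7 and Python 3.

```python
def st(tag, attr=""):
    """Generate HTML start tag"""
    if attr != "":  # pad attr with a space
        attr = " " + attr
    return '<' + tag + attr + '>'

def et(tag):
    """Generate HTML end tag"""
    return '</' + tag + '>'

def elem(tag, content, attr=""):
    """Generate whole element"""
    return st(tag, attr) + content + et(tag)

# The inner element is just a string, so it can be used as content
inner = elem('em', 'nested')
outer = elem('p', 'This is ' + inner + ' content', 'class="note"')
print(outer)  # <p class="note">This is <em>nested</em> content</p>
```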

Create Elements Iteratively

So far, the examples have been mostly a novelty, but now we will begin to see the power of dynamic webpages over static ones. Let's say we have a long list of some data and we wish to display that data on a webpage in a structured way. Maybe it is a text file, and we want every line to be a paragraph on the webpage (wrapped in <p> tags). Maybe we have a database on the server, and we want to use data from a table and display it nicely. Or maybe we want to scrape some data from other websites and reorganize it on ours. For the purpose of demonstration, to keep the example self-contained, I will generate some data, then use it to create HTML.

#!/usr/bin/env python

## FUNCTION DEFINITIONS ##
def st(tag, attr=""):
   """Generate HTML start tag"""
   if attr != "": #pad attr with a space
      attr = " " + attr
   return '<' + tag + attr + '>'

def et(tag):
   """Generate HTML end tag"""
   return '</' + tag + '>'

def elem(tag, content, attr=""):
   """Generate whole element"""
   return st(tag, attr) + content + et(tag)

def eelem(tag, attr=""):
   """Generate empty element"""
   attr += ' /' # add space, slash to end
   return st(tag, attr)

def is_prime(x):
   """check if number is prime"""
   from math import sqrt
   sq = sqrt(x)
   if sq.is_integer():
      # perfect square, not prime
      return False
   for i in xrange(2, int(sq)+1):
      if not (x % i):
         # found factor, not prime
         return False
   # made it this far, must be prime
   return True

## START EXECUTION ##
title = 'Prime Numbers'
desc = 'List of prime numbers in certain range'

# define range to check, inclusive
x = 10
y = 1000

out = []
out.append('Content-Type: text/html\n')
out.append('<!DOCTYPE html>')
out.append(st('html'))
out.append(st('head'))
out.append('\t' + elem('title', title))
out.append('\t' + eelem('meta', 'name="description" content="' + desc + '"'))
out.append('\t' + eelem('link', 'rel="stylesheet" href="https://fosbenner.net/s/playground.css"'))
out.append(et('head'))

out.append(st('body'))
out.append('\t' + elem('h1', 'List of primes between ' + str(x) + ' and ' + str(y)))
out.append('\t' + st('ul'))

# add a list item of every prime number
for i in xrange(x, y+1):
   if is_prime(i):
      out.append('\t\t' + elem('li', str(i)))

out.append('\t' + et('ul'))
out.append(et('body'))
out.append(et('html'))

for i in out:
   print i

				

Previously, I nested several of my functions inside one another and chained them together, just to show that this could be done. This time I broke the commands out into more out.append() calls, as I think this is easier to read. The result is more strings in the out list, but now the print statement adds all of the newlines for me.

The broader the range in that script, the more list items will be added to the webpage, without adding any additional code to the script. This should demonstrate how a clever programmer could generate a fairly large webpage with a relatively short script.
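The same pattern covers the text file case mentioned earlier, where every line becomes a paragraph. Here is a minimal sketch, assuming the lines are already in a list; the function name lines_to_paragraphs and the sample data are hypothetical. The elem helper is a condensed version of the one in the script, and the code runs on both Python 2.7 and Python 3.

```python
def elem(tag, content, attr=""):
    """Generate whole element (condensed version of the script's helper)"""
    if attr != "":
        attr = " " + attr
    return '<' + tag + attr + '>' + content + '</' + tag + '>'

def lines_to_paragraphs(lines):
    """Wrap each non-blank line in <p> tags and return a list of strings."""
    out = []
    for line in lines:
        text = line.strip()  # drop the trailing newline and extra spaces
        if text:             # skip blank lines
            out.append(elem('p', text))
    return out

sample = ['First line\n', '\n', 'Second line\n']
for p in lines_to_paragraphs(sample):
    print(p)
```

Since a file object yields its lines when iterated, the open file could be passed in place of the sample list.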

Adding User Input

The last thing I would like to demonstrate is how you can add interactivity by using some simple HTML forms in combination with Python's cgi module. By default, an HTML form is submitted using an HTTP GET request, which encodes the form data into the URL. I will be adding onto the previous example, in which I used x and y to specify the range of numbers to analyze. In the following example, you will notice that x and y still have default values, but these can be overridden by entering new values in the form, or by appending these variables to the URL; for example: ?x=30&y=600

#!/usr/bin/env python

## FUNCTION DEFINITIONS ##
def st(tag, attr=""):
   """Generate HTML start tag"""
   if attr != "": #pad attr with a space
      attr = " " + attr
   return '<' + tag + attr + '>'

def et(tag):
   """Generate HTML end tag"""
   return '</' + tag + '>'

def elem(tag, content, attr=""):
   """Generate whole element"""
   return st(tag, attr) + content + et(tag)

def eelem(tag, attr=""):
   """Generate empty element"""
   attr += ' /' # add space, slash to end
   return st(tag, attr)

def is_prime(x):
   """check if number is prime"""
   from math import sqrt
   if x < 2:
      # 0, 1 and negatives are not prime (sqrt fails on negatives)
      return False
   sq = sqrt(x)
   if sq.is_integer():
      # perfect square, not prime
      return False
   for i in xrange(2, int(sq)+1):
      if not (x % i):
         # found factor, not prime
         return False
   # made it this far, must be prime
   return True

## START EXECUTION ##
import cgi

title = 'Prime Numbers'
desc = 'List of prime numbers in certain range'

# define range to check, inclusive
form = cgi.FieldStorage() # parse the submitted form data once
x = int(form.getvalue('x', '10'))
y = int(form.getvalue('y', '1000'))

# in case x is set higher than y
if x > y:
   x,y = y,x

out = []
a = lambda s: out.append(s)
a('Content-Type: text/html\n')
a('<!DOCTYPE html>')
a(st('html'))
a(st('head'))
a('\t' + elem('title', title))
a('\t' + eelem('meta', 'name="description" content="' + desc + '"'))
a('\t' + eelem('link', 'rel="stylesheet" href="https://fosbenner.net/s/playground.css"'))
a(et('head'))
a(st('body'))
a('\t' + st('form'))
a('\t\t x: ' + eelem('input', 'type="number" name="x" value="' +
   str(x) + '"'))
a('\t\t y: ' + eelem('input', 'type="number" name="y" value="' +
   str(y) + '"'))
a('\t\t' + eelem('input', 'type="submit" value="Get Primes"'))
a('\t' + et("form"))
a('\t' + elem('h1', 'List of primes between ' +
   str(x) + ' and ' + str(y)))
a('\t' + st('ul'))

# add a list item of every prime number
for i in xrange(x, y+1):
   if is_prime(i):
      a('\t\t' + elem('li', str(i)))

a('\t' + et('ul'))
a(et('body'))
a(et('html'))

for i in out:
   print i

				

You may notice that I changed some of the code a little bit. This time, you will see that I added a lambda function that performs out.append() for me. This means that I can write a() instead of out.append() and achieve the same thing.
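As a small aside, the lambda is not strictly required. In Python, out.append is itself a callable object, so it can be assigned to a short name directly. The sketch below shows both forms side by side; the name b is just for illustration.

```python
out = []
a = lambda s: out.append(s)   # the form used in the script
b = out.append                # bound method: same effect without a lambda

a('<html>')
b('</html>')
print(out)  # ['<html>', '</html>']
```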

Functionally, the program is improved by the use of the cgi module and some basic HTML forms. The FieldStorage().getvalue() method looks for a key in the URL (in this case x or y) and returns its value. If that key is not found, it returns a default value (10 or 1000 for x and y respectively). The form on the page provides a simple way for the user to enter the data, without requiring an understanding of how to encode a URL manually.
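To show what happens to the ?x=30&y=600 part of the URL, here is a minimal sketch that parses a query string by hand with the standard library's parse_qs. This is not what the script above does; it is a stand-in for FieldStorage that is useful for experimenting in a terminal, and also because the cgi module was removed in Python 3.13. The function name get_range is hypothetical. Unlike the script, this sketch falls back to the default when a value is not a whole number instead of raising an error.

```python
try:
    from urllib.parse import parse_qs   # Python 3
except ImportError:
    from urlparse import parse_qs       # Python 2

def get_range(query_string, default_x=10, default_y=1000):
    """Return (x, y) parsed from a query string such as 'x=30&y=600'."""
    params = parse_qs(query_string)  # maps each key to a list of values

    def as_int(key, default):
        try:
            return int(params[key][0])
        except (KeyError, ValueError):
            return default  # missing or non-numeric: use the default

    x, y = as_int('x', default_x), as_int('y', default_y)
    if x > y:
        x, y = y, x  # keep the range ordered, as the script does
    return x, y

print(get_range('x=30&y=600'))  # (30, 600)
print(get_range(''))            # (10, 1000)
```

In a real CGI script, the web server passes the query string in the QUERY_STRING environment variable, so os.environ.get('QUERY_STRING', '') could be used as the argument.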

That does it for this episode. I hope you found this as interesting as I did.

Cheers,

Adam