Fuzzy Mathematics with FuzzPy (Part 2)

In the first part of this post, we explored the foundations of fuzzy sets and fuzzy graphs, and discussed how you can use FuzzPy, a general fuzzy mathematics library for python, to work with these types. Today, we will expand a bit on those foundations by learning about fuzzy numbers, and creating visualizations with FuzzPy.

Fuzzy Numbers

Think of a fuzzy number as a number whose actual value is uncertain. We do not have its value, but we have a set of estimations, each with an associated degree of likelihood, or grade. If you remember the definition of a fuzzy set, you will remember that the structure of a fuzzy set is very similar to what I just described, so it only makes sense to use a fuzzy set to represent what is known about a fuzzy number.

Effectively, a fuzzy number is a fuzzy subset of the real line \mathbb{R}, where the value of each member of the set is an estimated value of the fuzzy number, and the \mu-value of the member is its grade.

triangular_fuzzy_number

Triangular Fuzzy Number

One can represent the estimated values as well as their associated \mu-values in the form of a Cartesian coordinate system, with the Y-axis representing \mu-values, and the X-axis representing estimated values.
We can then see that certain patterns emerge in the shape of the visualization, and these patterns are used to distinguish between the different types of fuzzy numbers:

  • Polygonal sets are represented by an arbitrary set of segments on the plane.
  • Trapezoidal sets contain two kernels (estimations whose \mu-value is exactly 1.0), and two X-intercepts.
  • Triangular sets contain exactly one kernel, and two X-intercepts
  • Gaussian sets have \mu-values determined by a normal distribution, thus providing a smoother distribution.

There are a few additional types as well, but these are the four basic types, and are all supported out of the box by FuzzPy.

Because the points of interest in each type are different, instanciating a new FuzzyNumber in FuzzPy is done differently depending on the type. Let’s instantiate objects for each of these types using FuzzPy, and we’ll play a bit with these objects in a moment. Make sure you have FuzzPy installed first (easy_install -U fuzzpy) and save the following code in fuzzynums.py:

import fuzz
 
# Instantiating a PolynomialFuzzyNumber is done by providing
# a list of (value, mu) tuple to the constructor:
polygonal = [fuzz.PolygonalFuzzyNumber([(0.0, 0.0), \
    (0.25, 1.0), (1.0, 0.5), (1.3, 1.0), (1.8, 0.0)])]
 
polygonal += [fuzz.PolygonalFuzzyNumber([(0.0, 0.0), \
    (1.0, 0.5), (3.0, 1.0), (4.0, 1.0), (6.0, 0.0)])]
 
# A trapezoidal needs to be given a tuple containing the values
# of the kernels (mu = 1.0) and a tuple containing the values
# of the support (mu = 0.0)
trapezoidal = [fuzz.TrapezoidalFuzzyNumber((1.0, 2.0), \
    (0.3, 3.0))]
 
trapezoidal += [fuzz.TrapezoidalFuzzyNumber((1.0, 6.0), \
    (0.5, 10.0))]
 
# A triangular set needs to be provided the value of the
# kernel, and a tuple containing the values of the support.
triangular = [fuzz.TriangularFuzzyNumber(1.0, (0.0, 3.0))]
triangular += [fuzz.TriangularFuzzyNumber(4.0, (0.0, 4.5))]
 
# Gaussian fuzzy sets simply need the mean, and standard
# derivation for the estimated value.
gaussian = [fuzz.GaussianFuzzyNumber(1.0, 0.2)]
gaussian += [fuzz.GaussianFuzzyNumber(12.0, 1.0)]

We can now fire up the python interpreter and work with our objects:

>>> # Polygonal Union:
... print polygonal[0] | polygonal[1]
PolygonalFuzzyNumber: kernel [(0.25, 0.25), (1.3, 1.3), \
(3.0, 4.0)], support [(0.0, 6.0)]
 
>>> # Polygonal Intersection:
... print polygonal[0] & polygonal[1]
PolygonalFuzzyNumber: kernel [], support [(0.0, 6.0)]
 
>>> # Trapezoidal Addition:
... print trapezoidal[0] + trapezoidal[1]
TrapezoidalFuzzyNumber: kernel (2.0, 8.0), support (0.8000000\
0000000004, 13.0)
 
>>> # Gaussian Addition
... print gaussian[0] + gaussian[1]
GaussianFuzzyNumber: kernel (13.0, 13.0), support (4.85663149\
07018668, 21.143368509298135)

The class inheritance chain for FuzzPy’s FuzzyNumber derived classes looks like this:

fuzzy_number_class_inheritance.png

Fuzzy Number Class Inheritance

Visualizations

FuzzPy ships with a small visualization framework that is able to create visualizations for almost all FuzzPy types, and encode them in a variety of formats. In most cases, creating and storing a visualization is a trivial operation:

from fuzzynums import *
from fuzz.visualization import VisManager
 
gaussian = fuzz.GaussianFuzzyNumber(1.0, 0.2)
plugin = VisManager.create_backend(gaussian)
(vis_format, vis_data) = plugin.visualize()
 
with (open("visualization.%s" % vis_format, "wb")) as fp:
    fp.write(vis_data)

In this example, we passed our FuzzyNumber instance to the VisManager.create_backend() method, which is a plugin factory method, and it returned the first plugin it could find that can be used to create a visualization of our FuzzyNumber.

The plugin’s visualize() method was called, and returned a tuple containing the file format of the visualization, and its payload. We then created a file using the returned format as extension, and stored the payload in that file.

You can also force the VisManager object to use a certain plugin if you do not want to rely on the default plugin picked for the type of object to visualize:

plugin = VisManager.create_backend(gaussian, 'num_gnuplot')

Similarly, a plugin’s visualize() method can accept arguments to customize the final output. For example, the following code would tell the plugin to use the ‘gif’ image format:

(vis_format, vis_data) = plugin.visualize(format="gif")

You will need to have Graphviz installed for graph visualizations, and Gnuplot for fuzzy number visualizations.

Now that you know how to use all the FuzzPy types and how to create visualizations for your data, you can start using FuzzPy in your own project, or get involved in the development process. Check out the project’s GitHub page for more information.

fuzzy_digraph

Fuzzy Digraph Visualization

polygonal_fuzzy_number

Polygonal Fuzzy Number

Fuzzy Mathematics with FuzzPy (Part 1)

FuzzPy is a python library that exposes specialized datatypes to deal with a wide range of fuzzy number types, fuzzy sets, and fuzzy graphs. Binary operations against each type, such as set unions and intersections, can be performed using some of python’s native binary operators (|, &, ==), or specialized methods if you wish to deviate from the default functions for these computations.

FuzzPy also provides several helper methods such as kernel and neighbors isolation, alpha-cuts, cardinality testing, shortest-path computation, and minimum spanning trees. A powerful visualization framework also allows you to quickly create and save visualizations for your data.

In this post, we will examine fuzzy sets and fuzzy graphs, and see how one can use FuzzPy to work with these types, providing examples for each. In the next post of this series, we will examine the different types of fuzzy numbers, generate visualizations, and explore advanced concepts such as automatic fuzzification, and importing graph data from fuzzy adjacency matrices.

Fuzzy Sets

A fuzzy set is simply a set in which each element has an associated degree of membership into the set. This membership value is called \mu-value of the element. A \mu-value of zero indicates that the element does not belong to the set, while any value between zero (exclusive) and one (inclusive), indicates that the element is, to a certain degree, a set member.

Creating a fuzzy set with FuzzPy is just a matter of creating a new FuzzySet object, then either calling the add(FuzzyElement(value, mu)) method for each element to import, or using the update(elements) method, providing it with a list of FuzzyElement(value, \mu-value) tuples for bulk import.

from fuzz import FuzzySet, FuzzyElement
 
# Create two lists of FuzzyElements
elements_a = [(1, 0.3), (2, 0.5), (3, 0.7), (4, 0.9)]
elements_b = [(3, 0.8), (6, 0.6), (1, 0.4), (8, 0.2)]
 
elements_a = [FuzzyElement(x[0], x[1]) for x in elements_a]
elements_b = [FuzzyElement(x[0], x[1]) for x in elements_b]
 
# Create two fuzzy sets
set_a = FuzzySet()
set_b = FuzzySet()
 
set_a.update(elements_a)
set_b.update(elements_b)

We’ve just created a couple fuzzy sets with four elements each. Notice that two of the elements are common to both set_a and set_b (albeit with different \mu-values). We can now perform several operations against these sample sets. Save the code above in a file called example.py and fire up the python interpreter to test fuzzy set functionality against our sets:

>>> from example import *
>>> # Set Difference
...
>>> set_a - set_b
FuzzySet([FuzzyElement(2, 0.500000), FuzzyElement(4, 0.900000)])
>>> set_b - set_a
FuzzySet([FuzzyElement(8, 0.200000), FuzzyElement(6, 0.600000)])
# Set Union
...
>>> set_a | set_b
FuzzySet([FuzzyElement(1, 0.400000), FuzzyElement(2, 0.500000), FuzzyElement(3, 0.800000), FuzzyElement(4, 0.900000), FuzzyElement(6, 0.600000), FuzzyElement(8, 0.200000)])
# Set Intersection
...
>>> set_a & set_b
FuzzySet([FuzzyElement(1, 0.300000), FuzzyElement(3, 0.700000)])

Fuzzy graphs

In a fuzzy graph or digraph, each vertice and each edge can be associated with a \mu-value. The \mu-value of a vertice indicates the vertice’s degree of membership into the graph, while an edge or arc’s \mu-value indicates the degree of connectivity between the head and tail vertices.

The creation of fuzzy graphs is also pretty straightforward: create a set or fuzzy set (see above), then feed that set to the fuzz.Graph constructor, along with the directed boolean value to specify whether you’d like to create a graph or a digraph. You can then use the fuzz.Graph.connect(head, tail, mu) method to add edges to the graph:

import fuzz
 
# Create two sets of vertices
set_a = set([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
set_b = set([11, 12, 13, 14, 15, 16, 17, 18, 19, 20])
 
# Create our graph objects
graph_a = fuzz.FuzzyGraph(viter=set_a, directed=False)
graph_b = fuzz.FuzzyGraph(viter=set_b, directed=True)
 
# Create some arcs in graph_a
graph_a.connect(1, 3, 0.5)
graph_a.connect(3, 6, 0.06)
graph_a.connect(2, 8, 0.2)
graph_a.connect(4, 1, 0.9)
 
# And some in graph_b
graph_b.connect(11, 19, 0.9)
graph_b.connect(14, 12, 0.3)
graph_b.connect(15, 14, 0.5)

Save the code above in a file named graphs.py and fire up your python interpreter so we can experiment with graph operations:

>>> from graphs import *
>>> # Retrieve the mu value of the edge between 1 and 3
... 
>>> graph_a.mu(3, 1)
0.5
>>> # Retrieve an alpha-cut of graph_b against mu-value of 0.5
... 
>>> graph_b.alpha(0.5)
Graph(viter = set([11, 12, 13, 14, 15, 16, 17, 18, 19, 20]), eiter = set([(15, 14), (11, 19)]), directed = True)
>>> # Is graph_a a subgraph of graph_b? (no)
... graph_a.issubgraph(graph_b)
False
>>> # Detect adjacent (connected) vertices
...
>>> graph_a.adjacent(1, 2)
False
>>> graph_a.adjacent(1, 3)
True
>>> # Find all the neighbours of the vertex with value 3
... 
>>> graph_a.neighbors(3)
set([1, 6])
>>> # Find the shortest path to all vertices using 6 as origin
... 
>>> graph_a.dijkstra(6)
{1: 3, 2: None, 3: 6, 4: 1, 5: None, 6: None, 7: None, 8: None, 9: None, 10: None}
>>> # Find the minimum spanning tree of graph_a
... 
>>> graph_a.minimum_spanning_tree()
Graph(viter = set([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]), eiter = set([(2, 8), (1, 3), (4, 1), (3, 6)]), directed = False)

There are a few more operations available for both crisp and fuzzy graphs/digraphs. You should consult the API doc for a comprehensive list.

You should also consult the examples provided as part of the module on GitHub to see FuzzPy in action.

Algorithms in Python: Binary exponentiation

The typical approach to exponentiation of a base b by an exponent n is to repeatedly multiply the base by itself, as such: b^n = \prod_{i=1}^{n} b

This is easy to compute for small values but quickly chokes with very large values.

A faster approach involves converting the exponent into base 2, then multiplying the running total and squaring the base each time a 1-bit is encountered. The python implementation looks like this:

def binary_exponent(base, exponent):
    """\
    Binary Exponentiation
 
    Instead of computing the exponentiation in the traditional way,
    convert the exponent to its reverse binary representation.
 
    Each time a 1-bit is encountered, we multiply the running total by
    the base, then square the base.
    """
    # Convert n to base 2, then reverse it.
    exponent = bin(exponent)[2:][::-1]
 
    result = 1
    for i in range(0, len(exponent)):
        if exponent[i] == '1':
            result *= base
        base *= base
    return result

Similarly, the same approach can be taken to perform modular exponentiation against very large exponents:

def modular_exponent(base, exponent, mod):
    """\
    Modular exponentiation through binary decomposition.
 
    We use the same technique as for the binary exponentiation above in
    order to find the modulo of our very large exponent and an arbitrary
    integer mod.
    """
    exponent = bin(exponent)[2:][::-1]
 
    x = 1
    power = base % mod
    for i in range(0, len(exponent)):
        if exponent[i] == '1':
            x = (x * power) % mod
        power = (power ** 2) % mod
    return x

The optimization is especially visible for the modulo operation, since we are working on smaller numbers with each iteration, whereas for simple exponentiation, the CPython interpreter can already optimize the operation.
The speedup for modular exponentiation on my system was in the order of 72x the speed of a simple (a^n) \mod{m} whereas the the computation of the exponentiation took about the same time in both cases.

Extending PostgreSQL with Python

One of the features I enjoy the most about PostgreSQL is the ability to write stored procedures in C, Perl, TCL, PgSQL, and yes… obviously also in Python. I’ve been using this feature since 7.4, so any recent version of PostgreSQL is pretty much guaranteed to support it, but you’ll need to have the pl/python procedural language contrib module installed. Once it’s installed, you can activate it for your current database using the following query:

CREATE PROCEDURAL LANGUAGE plpythonu;

Once the language bindings have been activated, you can start writing your stored procedures in python, however you should really read up on the following subjects first:

As an example, I’ve written a little SP in PL/Python to provide support for PCRE since the stock distribution of PostgreSQL only supports LIKE/SIMILAR, and POSIX Style Regular Expressions.

Let’s create our Python language binding, and create a standard text storage table:

-- Activate PL/Python
CREATE PROCEDURAL LANGUAGE plpythonu;
 
-- Create a plain text-storage table
CREATE TABLE text_storage
(
  id serial NOT NULL,
  payload CHARACTER VARYING(128),
  CONSTRAINT text_storage_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE text_storage OWNER TO xavier;
 
-- Let's throw in an index on the payload field for good measure
CREATE INDEX txt_payload_idx
  ON text_storage
  USING btree
  (payload);

Let’s now populate our table with some junk data:

INSERT INTO text_storage (payload) VALUES ('hello, world');
INSERT INTO text_storage (payload) VALUES ('the quick brown fox, blah blah blah');
INSERT INTO text_storage (payload) VALUES ('PCREs in Postgres');
INSERT INTO text_storage (payload) VALUES ('All hail Python!');
INSERT INTO text_storage (payload) VALUES ('Hello, test data!');
INSERT INTO text_storage (payload) VALUES ('Python would like to say Hello!');

And now we can go ahead and create our Python SP itself:

CREATE OR REPLACE FUNCTION pcre(text, text)
  RETURNS INTEGER AS
$BODY$import re
 
regex  = args[0]
in_str = args[1]
 
compiled = re.compile(regex)
 
IF compiled.search(in_str):
	RETURN 1
ELSE:
	RETURN 0$BODY$
  LANGUAGE 'plpythonu' VOLATILE
  COST 100;

As you can see, our PCRE matching system is extremely simple, yet pretty powerful. We import Python’s built-in re module, compile the specified regex argument, then attempt to match it against the other argument. Here’s a usage example on our test table:

SELECT id, payload FROM text_storage WHERE pcre('[H|h]ello', payload) = 1;
 id |             payload             
----+---------------------------------
  1 | hello, world
  5 | Hello, test DATA!
  6 | Python would LIKE TO say Hello!
(3 ROWS)

As always, feel free to suggest any improvements.

Fixing custom sequences in PostgreSQL

PostgreSQL provides a mechanism called sequences, which I believe is extracted from ANSI-SQL92, though I’m too lazy to check, which are basically stateful counters that provide some helper functions. The primary use of sequences in PostgreSQL and Oracle, is to implement auto-incrementing counters as a table field.

PostgreSQL will automatically create a sequence when you use the “SERIAL” datatype for your field, and will take care of assigning the default value of your field as the result of the nextval() call on the sequence, so most of the time you don’t need to interact directly with sequences.

Where things can get hairy however, is when you back up your data using pg_dump or any other mechanism, and restore that data in a table that is already populated. A common scenario for example, is populating a table that already has some recent data, with some older data you’ve been storing in archival. The opposite is true if you are trying to archive data from a table to a backup database for example.

Whenever you manually have to provide a value for the field assigned to a sequence, you are pretty much guaranteed to break the sequence unless you take the time to nextval() your sequence until it is in sync. This is a real problem, as pg_dump does not include sequence synchronization in its output.

The quickest way to synchronize a sequence, based on my observations, is to run the following query, taking care to replace the name of the sequence [SEQ] and the name of the associated table [TABLE] and field [FIELD]:

SELECT SETVAL([SEQ], COALESCE((SELECT [FIELD] FROM [TABLE] ORDER BY [FIELD] DESC LIMIT 1), 0)+1)

This will fetch the highest value of [FIELD] in [TABLE], increment it by 1, and synchronize the sequence [SEQ] to the new value.

I’ve also written the following little Python script that will look for any non-system sequence in the specified database, and use this method to repair it. Let me know if it works out for you, or if you’d like to suggest some improvements.

Requirements: Python 2.5+, PsycoPG2 Python module (python-psycopg2)

#!/usr/bin/env python
# -*- coding: utf-8 -*-
 
 
# Standard imports
import sys
import os
import time
from optparse import OptionParser
 
# psycopg2 import
try:
    import psycopg2
except ImportError, e:
    print 'You must install the python module named "psycopg2" in order to use this module.'
    sys.exit(os.EX_SOFTWARE)
 
 
 
class PgRepairman:
    def __init__(self, options, parser):
        self.options = options
        try:
            dsn = "dbname=%s host=%s user=%s" % (options.db, options.host, options.user)
            dsn += ("" != options.passwd) and ("password=%s" % options.password) or ""
            self.conn = psycopg2.connect(dsn)
            self.curs = self.conn.cursor()
        except Exception, e:
            print "ERROR - %s" % e
            sys.exit(1)
 
 
    # Returns a dict for a psycopg2 row object
    def _to_dict(self, desc, res):
        return dict(zip([x[0] for x in desc], res))
 
 
    # Attempt to locate all custom sequences
    def findSequences(self):
        seq_query = """
        SELECT pc1.relname AS seq, pc2.relname AS table, c.attname AS field 
        FROM pg_depend, pg_class pc1, pg_class pc2, pg_attribute c 
        WHERE pc1.oid = pg_depend.objid 
            AND pc2.oid = pg_depend.refobjid 
            AND c.attnum = pg_depend.refobjsubid 
            AND c.attrelid = pc2.oid 
            AND pc1.relkind = 'S' 
            AND pc1.relname NOT LIKE 'pg_toast%%'
        """
 
        try:
            self.print_verbose(seq_query)
            self.curs.execute(seq_query)
        except Exception, e:
            print "[ERROR] - %s" % e
            sys.exit(1)
 
        desc = self.curs.description
        for row in self.curs.fetchall():
            yield self._to_dict(desc, row)
 
 
    # Increment the key value to the value of the sequence + 1
    def fixSequences(self):
        for seq in self.findSequences():
            print "Fixing sequence %s in table %s" % (seq['seq'], seq['table'])
            fix_query = "SELECT setval('%s', COALESCE((SELECT %s FROM %s ORDER BY %s DESC LIMIT 1), 0)+1)" % (seq['seq'], seq['field'], seq['table'], seq['field'])
            try:
                self.print_verbose(fix_query)
                self.curs.execute(fix_query)
            except Exception, e:
                print "[WARNING] - %s" % e
                pass
 
    def print_verbose(self, msg):
        if (True == self.options.verbose):
            print "[DEBUG] - %s" % msg
 
if __name__=='__main__':
    usage       = "Usage: %prog <options> [-v --verbose] [-u --username | -p --password \ -o --host | -d --database]"
    version     = "%prog v1.0\nDistributed under the LGPL2 License"
    description = "Increments all sequences in a PostgreSQL database"
    parser = OptionParser(usage=usage, version=version, description=description)
    parser.add_option("-v", "--verbose", action="store_true", dest="verbose", default=False, help="Enable extra output")
    parser.add_option("-o", "--host", action="store", dest="host", default="127.0.0.1", help="Database hostname/IP")
    parser.add_option("-u", "--username", action="store", dest="user", default="postgres", help="Database Username")
    parser.add_option("-p", "--password", action="store", dest="passwd", default="", help="Database Password")
    parser.add_option("-d", "--database", action="store", dest="db", default="template1", help="Database Name")
 
 
    try:
        (options, args) = parser.parse_args()
        obj = PgRepairman(options, parser)
        obj.fixSequences()
    except KeyboardInterrupt, e:
        sys.exit(1)

You can also download the script here.