Posted by: sureshamrita | June 2, 2012

Sorting an array with original index information preserved

Some times it becomes necessary to sort an array with the original index information preserved. One easy solution is to create new array of pairs containing the original value and index, and sort the new array based on the values and from that the index position can be obtained. In this article, another solution using Boost.MultiIndex is provided.

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/identity.hpp>
#include <boost/multi_index/random_access_index.hpp>
#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;
using boost::multi_index_container;
using namespace boost::multi_index;

/*The following typedef says that mycontainer is a multiindex container, storing integers with 2 indices. One random access index, and the other is ordered based on integer values and non unique. The random access index provides operator[] and at() for accessing elements based on their position in the index. This method has the advantage that it can be adapted to any number indices for containers containing arbitrary (user defined )type of elements.
*/

typedef multi_index_container<int, indexed_by<random_access<>, ordered_non_unique<identity<int>>>> mycontainer;

int main() {
    vector<int> v { 1, 4, 2, 6, 6, 9, -12 };

    mycontainer mc;
    copy(v.begin(), v.end(), back_inserter(mc));

//original index

    for (auto x : mc) cout << x << " ";
    cout << endl;

//original index - another way
    const mycontainer::nth_index<0>::type& index0 = get<0>(mc);
    for (auto x : index0) cout << x << " ";
    cout << endl;

//original index yet another way
    for(int i = 0; i < mc.size(); i++) cout << mc[i] << " ";
    cout << endl;

//sorted index. note that the type of index1 cannot be replaced by auto
    const mycontainer::nth_index<1>::type& index1 = get<1>(mc);
    for (auto x : index1) cout << x << " ";
    cout << endl;

//obtaining the original index using project.

    for (auto x = index1.begin(); x != index1.end(); x++) {
        mycontainer::iterator i2 = mc.project<0>(x); // this line provides the iterator position before sorting, corresponding to the sorted iterator
        cout << distance(mc.begin(), i2) << " "; //distance gives the difference between two iterators
    }
    cout << endl;
}
Posted by: sureshamrita | March 7, 2012

Boost multiarray – some examples

The following code shows how a 2D array can be manipulated using boost::multi_array. The code shows how to fill the array, find the row/column sum, display alternate elements, replace alternate elements etc. 

#include <boost/multi_array.hpp>
#include <iostream>
#include <numeric>
using namespace std;

int main() {
	const int rows = 3;
	const int cols = 5;
	typedef boost::multi_array<double, 2> array_type;
	typedef array_type::index index;
	array_type A(boost::extents[rows][cols]);

	array_type::element* itr = A.data();

	for (int i = 0; i < rows * cols; i++)
		*itr++ = i;

	cout << "displaying matrix" << endl;
	for (int i = 0; i < rows; i++) {
		for (int j = 0; j < cols; j++)
			cout << A[i][j] << " ";
		cout << endl;
	}
    cout << endl;
    
	cout << "row sum" << endl;
	for (int i = 0; i < rows; i++)
		cout << accumulate(A[i].begin(), A[i].end(), 0.0) << " ";
	cout << endl << endl;
	
	
	cout << "row sum - another method" << endl;
	typedef array_type::index_range range;
	for (int i = 0; i < rows; i++) {
		array_type::array_view<1>::type myview =
				A[boost::indices[i][range()]];
		cout << accumulate(myview.begin(), myview.end(), 0.0) << " ";
	}
	cout << endl << endl;
	
	cout << "col sum" << endl;
	for (int i = 0; i < cols; i++) {
		array_type::array_view<1>::type myview =
				A[boost::indices[range()][i]];
		cout << accumulate(myview.begin(), myview.end(), 0.0) << " ";
	}
	cout << endl << endl;
	
	cout << "displaying odd positioned  elements in a row" << endl;
	for (int i = 0; i < rows; i++) {
		array_type::array_view<1>::type myview =
				A[boost::indices[i][range(0,5,2)]];
		copy(myview.begin(),myview.end(),ostream_iterator<double>(cout," "));
		cout << endl;
	}
	cout << endl << endl;
	
	
	cout << "displaying even positioned  elements in a row" << endl;
	for (int i = 0; i < rows; i++) {
		array_type::array_view<1>::type myview =
				A[boost::indices[i][range(1,5,2)]];
		copy(myview.begin(),myview.end(),ostream_iterator<double>(cout," "));
		cout << endl;
	}
	cout << endl << endl;
	
	cout << "replacing even positioned  elements with -10 " << endl;
	for (int i = 0; i < rows; i++) {
		array_type::array_view<1>::type myview =
				A[boost::indices[i][range(1,5,2)]];
		fill(myview.begin(),myview.end(),-10);
	}
    cout << "displaying matrix" << endl;
	for (int i = 0; i < rows; i++) {
		for (int j = 0; j < cols; j++)
			cout << A[i][j] << " ";
		cout << endl;
	}
	cout << endl << endl;
	
	return 0;
}
Posted by: sureshamrita | September 26, 2011

2D array from boost::multi_array

The following gives the code for a 2D array based on boost::multi_array. The code supports retrieval of arbitrary rows and columns. The code uses some features of C++ 11x. So use the appropriate compiler flag to compile the code.

Application Code:

#include "array2d.h"
#include <vector>
#include <iterator>
#include <iostream>
using namespace std;

int main(){
	//defining the array
        Array2d<int> array(2,5);
        //populating it
	Array2d<int>::element * itr = array.data();
	for(int i = 0; i < 10; i++){
		*itr++ = i;
	}

        //getting an arbitrary row
	vector<int>r;
	array.row<1>(back_inserter(r));
	copy(r.begin(),r.end(),ostream_iterator<int>(cout," "));

	cout << endl;

        //getting an arbitrary column
	vector<int>c;
	array.col<2>(back_inserter(c));

	copy(c.begin(),c.end(),ostream_iterator<int>(cout," "));
	return 0;
}

The array class definition is given below:


#ifndef ARRAY_H_
#define ARRAY_H_

#include <boost/multi_array.hpp>
#include <algorithm>
#include <iostream>
#include "mystrcat.h"
#include <stdexcept>

using namespace std;

template <class T>
class Array2d{
public:
	typedef typename boost::multi_array<T,2> array_type;
	typedef typename array_type::element element;
	typedef boost::multi_array_types::index_range range;

	Array2d(uint rows, uint cols);
	element * data();

	void display();

	template<int c, class Itr>
	void col(Itr itr); //dumps the cth column to the container pointed to by itr

	template<int r, class Itr>
	void row(Itr itr); //dumps the rth row to the container pointed to by itr

private:
	array_type array;

	uint rows;
	uint cols;
};

template <class T>
Array2d<T>::Array2d(uint _rows, uint _cols):rows(_rows),cols(_cols){
	array.resize(boost::extents[rows][cols]);
}

template <class T>
typename Array2d<T>::element* Array2d< T >::data(){
	return array.data();
}

template<class T>
template<int c, class Itr>
void Array2d<T>::col(Itr itr) {
	//copies column c to the given container

	if (c < 0 || c >= cols) throw runtime_error(Mystrcat("Only", cols, "columns available and you asked for column",c));
	typename array_type::template array_view<1>::type myview =
			array[boost::indices[range()][c]];
	std::copy(myview.begin(), myview.end(), itr);
}
template<class T>
template<int r, class Itr>
void Array2d<T>::row(Itr itr) {
	//copies row r to the given container
	if (r < 0 || r >= rows) throw runtime_error(Mystrcat("Only", rows, "rows available and you asked for row",r));
	typename array_type::template array_view<1>::type myview =
			array[boost::indices[r][range()]];
	std::copy(myview.begin(), myview.end(), itr);
}

template <class T>
void Array2d<T>::display(){
	for (auto i = array.data(); i < array.data()+array.num_elements(); i++)
		cout << *i << endl;
}

#endif /* 2DARRAY_H_ */
Posted by: sureshamrita | August 30, 2011

Current day and time using Boost

#include <iostream>
#include "boost/date_time/local_time/local_time.hpp"
using namespace std;

int main(){
  boost::posix_time::ptime now = boost::posix_time::second_clock::local_time();
  cout << now << endl;
}

Tested with Boost version 1.41

Posted by: sureshamrita | August 28, 2011

Extending the python configparser

The existing configparser module of python, does not provide the following operations.

  • If you are reading a directory/file name from a config file, it does not check if the directory/file exist
  • It does not even check if the config file itself is existing or not in its read() method.
  • It does not provide any specific methods to read and process multiple valued options, say comma separated. In such a situation the user would like to get the items as a list after removing the leading and trailing white space etc.
My guess is that such refined processing is left to the user of the configparser module. The following code defines a new class AmritaConfigParser which is derived from the SafeConfigParser class which is defined in the configparser module. The AmritaConfigParser class redefines/adds the following methods to remedy the above mentioned shortcomings.
  1. read(): It checks if the file is present and if it is present calls the base class read(). If the file is not present, raises an exception.
  2. getFile(section,option,check = True,dir='False') : Reads a dir or file name. If the check option is True, the existence of the read file is checked and an exception raised if it does not exist. If the dir option is made true, the read value is appended with '/' if it does not exist already. For windows the character appended would be '\'.
  3. getFileList(self,section,option,sep = ',' , check = True): It is same as getFile() method but separates the string at the sep character and removes the leading/trailing space and returns the list. Had python permitted function overloading, these two methods could have been made into one. Or is there any other tricks to combine these two methods into one?
The code is given below:
#!/usr/bin/python3.1
from configparser import  SafeConfigParser
import os

class AmritaConfigParser(SafeConfigParser):
    def __init__(self):
        super().__init__()

    def read(self,file):
        if not os.path.exists(file):
            raise OSError(file + ' does not exist')
        return super().read(file)

    def getFile(self,section,option,check = True,dir='False'):
        """gets the filename, checks if it is present (if check
        boolean is True)  or else gives an error
        message, and appends a / at the end if it does not exist already
        in case the retrieved file is a directory """

        file = super().get(section,option)
        if check:
            if not os.path.exists(file):
                raise OSError(file + ' does not exist')

        if dir:
            if not file.endswith(os.sep):
                file = file + os.sep

        return file

    def getFileList(self,section,option,sep = ',',check = True):
        """reads a string which contain 'sep' separated files.
        removes the trailing space between them
        checks for their presence if the check bool variable is True
        Finally returns a list """
        fileList = super().get(section,option)
        if fileList:
            fileList = fileList.split(',')
            fileList = [x.strip() for x in fileList]

            if check:
                for f in fileList:
                    if not os.path.exists(f):
                        raise IOError(f + ': File cannot be openend')
        return fileList
Posted by: sureshamrita | August 27, 2011

random_shuffle with Boost generator

The C++ algorithm random_shuffle accepts a random number generator as the third argument. One advantage of using your own random number generator is that  you can seed it the way you want. The following code segment demonstrates how it can be done with boost random number generator.

#include <boost/random/mersenne_twister.hpp>
#include <boost/random/uniform_int.hpp>
#include <boost/random/variate_generator.hpp>

boost::mt19937 generator(currentTimeNS());
boost::uniform_int<> uni_dist;
boost::variate_generator<boost::mt19937&, boost::uniform_int<> > randomNumber(generator, uni_dist);

vector<int> v = {1,2,3,4,5,6,7,8,9,10};
random_shuffle(v.begin(), v.end(),randomNumber);

The function currentTimeNS() gives the current time in nanoseconds which is used as a seed. For an implementation of  currentTimeNS() function see this link.

Posted by: sureshamrita | August 24, 2011

A C++ implementation of k-fold cross validation

In K-fold cross validation, the given data is randomly divided into k disjoint sets of equal size n/k, where n is the total number of patters present in the given data. The classifier is trained k times, each time with a different set held out as a validation set. The estimated performance is the mean of these k errors. (Taken from Duda, Hart, and Stork – Pattern Classification, 2006, page 483.

The given code implements a k-fold cross validation with the following features.

  • The code is independent of the container in which the given data is stored.
  • The code just expects an iterator pointing to the beginning of the container and one pointing to the end (to be precise, one past the end) of the container.
  • The partitioned data can be stored in any container.
  • The code expects only an output iterator pointing to the destination containers (one for training data and the other for testing data)

The header file “mystrcat.h” included in the following code is described here

#ifndef _kfoldh_
#define _kfoldh_
#include <vector>
#include "/home/suresh/C++/myCodes/mystrcat.h"
#include <stdexcept>
#include <algorithm>
#include <iterator>
#include <boost/foreach.hpp>

using namespace std;

#define foreach BOOST_FOREACH

template<class In>
class Kfold {
public:
	Kfold(int k, In _beg, In _end);
	template<class Out>
	void getFold(int foldNo, Out training, Out testing);

private:
	In beg;
	In end;
	int K; //how many folds in this
	vector<int> whichFoldToGo;

};

template<class In>
Kfold<In>::Kfold(int _k, In _beg, In _end) :
		beg(_beg), end(_end), K(_k) {
	if (K <= 0)
		throw runtime_error(Mystrcat("The supplied value of K is =", K,". One cannot create ",k, "no of folds"));

	//create the vector of integers
	int foldNo = 0;
	for (In i = beg; i != end; i++) {
		whichFoldToGo.push_back(++foldNo);
		if (foldNo == K)
			foldNo = 0;
	}
	if (!K)
		throw runtime_error(Mystrcat("With this value of k (=", k ,")Equal division of the data is not possible"));
	random_shuffle(whichFoldToGo.begin(), whichFoldToGo.end());
}

template<class In>
template<class Out>
void Kfold<In>::getFold(int foldNo, Out training, Out testing) {

	int k = 0;
	In i = beg;
	while (i != end) {
		if (whichFoldToGo[k++] == foldNo) {
			*testing++ = *i++;
		} else
			*training++ = *i++;
	}
}

#endif
//******************************Application Code
#include "kfold.h"
#include <vector>
using namespace std;

int main() {
	vector<int> v = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

	const int folds = 5;
	Kfold<vector<int>::const_iterator> kf(folds, v.begin(), v.end());

	vector<int> test, train;

	for (int i = 0; i != folds; i++) {
		kf.getFold(i + 1, back_inserter(train), back_inserter(test));
		cout << "Fold " << i + 1 << " Training Data" << endl;
		foreach(auto x,train)
					cout << x << " ";
		cout << endl;
		cout << "Fold " << i + 1 << " Testing Data" << endl;
		foreach(auto x,test)
					cout << x << " ";
		cout << endl;

		train.clear();
		test.clear();
	}

	return 0;
}
 

The code uses the auto keyword from C++ 0x. So please  select the appropriate compiler flag to include C++0x features. In the case of linux, the compiler flag is: -std=c++0x

Posted by: sureshamrita | August 24, 2011

Concatenating arbitrary number/types of arguments into a string

When exceptions are thrown from code, to make the received message meaningful, many a time one would like to combine variable values and strings into a const string. For example, if an exception is thrown because an argument is negative, one would like the message thrown to look like,  “expecting positive arguments, but the current value of the argument is  10”.   In order to create such strings, we need a function which can accept arbitrary number of arguments of arbitrary types and then concatenate all of them into a string. The following code does just that.

#include <sstream>
#include <iostream>
#include <string>

using namespace std;

class Mystrcat{
public:
  template<typename T, typename ...P>
  explicit Mystrcat(T t, P... p){init(t,p...);}

  operator const string(){return o.str();}

private:
  ostringstream o;
  void init(){}
  template<typename T, typename ...P>
  void init(T t, P... p);
};

template<typename T, typename ...P>
void Mystrcat::init(T t, P ...p){
  o << t << ' ';
  init(p...);
}
int main(){
  int x;
  cin >> x;
  if (!x) throw runtime_error(Mystrcat("The value you entered is", x, "but non zero value expected", "error from file:", __FILE__, "line no:",__LINE__));
}
The above code uses variadic templates, which is a feature of C++0x. Hence to use the above code, one should select the appropriate compiler flag to include C++0x features. In the case linux, the compiler flag is: -std=c++0x
To see an implementation of the above without using classes, look at this usenet post. A similar implementation is available here.
Posted by: sureshamrita | March 19, 2011

LaTeX + new fonts = XeLaTeX

I was fascinated when I saw that xelatex can use all the new fonts like Asana Math, Cambria (for text and math), Calibri etc, and I need not learn anything new apart from my existing latex understanding. This blog is the summary of my efforts in installing xelatex and the new fonts and making them work on my Ubuntu 10.04 machine!

First of all, for xelatex and the new fonts to work, one should have the latest tex distribution. Unfortunately Ubuntu 10.04 still have only texlive 2009. So the first step would be to acquire texlive 2010 and install it. Once it is done, the rest is trivial. Again installing texlive 2010 is also very easy – straightforward for at least Ubuntu.

1 (a). Installing TeXLive 2010 from a CD

One can get an iso image of TeXLive 2010 from the TeXLive Site. Once the iso image is burned into a DVD, you can do the following for installation.

  • You may need the perl-tk package before you start. You can install this on ubuntu by the command sudo aptitude install perl-tk . I got this information from the blog How to install TeXLive 2010 on Ubuntu 10.10
  • Once  you have installed perl-tk, and the DVD ready, you can follow the detailed instructions on the blog How to install Vanilla TexLive 2010 on Ubuntu 10.04.
  • Make sure to rename your local texmf tree if you have one. I forgot to do this and it created lot of difficulties for me.

1(b) Installing TeXLive 2010 from Internet

Go to this site.

2. Updating TeXLive 2010

  • First you have change the repository from which tlmgr would do the updates. It is done by the command,

tlmgr option repository http://mirror.ctan.org/systems/texlive/tlnet

  • Now you have to update tlmgr itself, if needed. Run the command tlmgr update --self to update the tlmgr itself.
  • Once tlmgr is updated, you can run tlmgr update -all to install all updates. Beware, it will take some time, even to start dumping messages on the screen.

3. Where to put microsoft/other fonts in texlive?

If your platform is windows, you can skip this step and go to section 4. If you are on a Ubuntu machine, you will not have access to the Microsoft fonts in the machine. You need to have the ttf files of the microsoft windows if you want to use them with latex.  Once  you get the font files, this is how you would install them. Getting the font files is described in the next section.

  • TeXlive will have a texmf-local folder inside your texlive installation. Inside that a fonts directory is present. Whatever ttf files you get hold of, you can create a directory inside the texmf-local/fonts/ and store the ttf files. For example I have the following directories there: texlive/texmf-local/fonts/vista, texlive/texmf-local/fonts/comic, texlive/texmf-local/fonts/Bergamo, texlive/texmf-local/fonts/sorts-mill-goudy, texlive/texmf-local/fonts/sorts-mill-goudy, texlive/texmf-local/fonts/STIXv1.0.0 etc.
  • After installing new fonts, you have to run the command fc-cache -fv from your home directory.

4. How to get the microsoft fonts for use in Ubuntu/LaTeX?

You may use any of the method described below:

5. You are done! Ready to test the comic font now

\documentclass[12pt]{article}
\usepackage{fontspec}
\usepackage{amsmath}
\usepackage{unicode-math}
\usepackage{lipsum}
\setromanfont{Comic Sans MS}
\setmathfont{XITS}
\begin{document}
\begin{equation}
x = \frac{-b \pm \sqrt{b^2 – 4ac}}{2a}
\end{equation}
\lipsum[1]
\end{document}

You can copy the above code from here

\documentclass[12pt]{article}
\usepackage{fontspec}
\usepackage{amsmath}
\usepackage{unicode-math}
\usepackage{lipsum}
\setromanfont{Comic Sans MS}
\setmathfont{XITS}
\begin{document}
\begin{equation}
    x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\end{equation}
\lipsum[1]
\end{document}

This is the output on my machine!

6. How to compile the above code?

  • Save the above file as test.tex
  • xelatex test.tex
  • View the pdf file in your favourite pdf viewer!

7. Some Explanations for the LaTeX code

  • \usepackage{fontspec}: Allows  you to put arbitrary fonts using the command \setromanfont{font family name} . For example \setromanfont{Comic Sans MS}
  • \usepackage{unicode-math}: Allows you to put arbitrary math font using the command, for example,\setmathfont{XITS}

8. How to find the Font Family name?

In windows, this is easy, but in Ubuntu, you can use the command otfinfo. For example, otfinfo --family STIXVarBol.otf gave me the family name as STIXVariants which I can use in the setromanfont or setmathfont commands.

9. Acknowledgement

When I got stuck, many in the comp.text.tex and xetex mailing list helped me. I want to particularly thank, Will Robertson, who is the author of the fontspec and unicode-math packages.

Posted by: sureshamrita | March 14, 2011

Reducing page count in a latex document

Many a time it becomes necessary to reduce the page size of your document. I face this challenge when I submit a paper for a conference where the typical maximum page size is 8. This blog describes some of the techniques that I found useful. A good summary is available on http://www.eng.cam.ac.uk/help/tpl/textprocessing/squeeze.html

Images

  • scaling of images: \includegraphics command has a scale option. eg. \includegraphics[scale=.5]. This reduced the overall size of the figure.
  • Another way to use this command is \includegraphics[width=\columnwidth]. This will scale the width exactly to column width and trial and error approach is not needed. Basic idea from Dr Ninad.
  • \includegraphics command has a trim option which allows one to trim whitespace from the 4 sides of the image. Example: \includegraphics[trim = 1in 2in 3in 4in]. This command removes 1in, 2in, 3in, and 4in of space from the left, bottom, right, and top of the image.
  • \usepackage[belowskip=-15pt,aboveskip=0pt]{caption} will allow you to reduce the skip above and below the caption.
  • \setlength{\intextsep}{10pt plus 2pt minus 2pt}. This command will let you change the space left on top and bottom of an in-text float.

Enumerations

  • Use paralist package and use compactenum/compactitem instead of enumerate. This will reduce the space between items.
  • Use inparaenum of the paralist package to convert an ordinary enumeration to paragraphy style enumeration.

Section/Subsection Titles

  • Look for section/subsection titles that take up more than one line and reduce their length.

Mathematics

  • Look for equations that are typeset in its own lines but are not referred to, in the document else where. Such equations can be merged with text and at least 3 lines can be saved.

References

  • Look for the bibliography listing. In my last paper, I replaced “IEEE Transactions on Pattern Analysis and Machine Intelligence” with “IEEE Tr. PAMI” and saved at least 10 lines.! This idea is from Professor Bir Bhanu (my PhD supervisor)

Page Length

  • The page length of one specific page may be increased by one or two lines without affecting the overall appearance by using: \enlargethispage{2\baselineskip}  will increase the page size by 2 lines.

Linespacing

  • This is the ultimate page size controlling command. \renewcommand{\baselinestretch}{.97} In this example, I have reduced  the baselinestretch  to 97% of the normal value. Its normal value is 100%. By changing this, the total page count of the document can be changed drastically, but at the cost of the documents visual quality . This idea is from Professor Bir Bhanu (my PhD supervisor)

Older Posts »

Categories