XS is EaSY

fishbot

fish-kwpm@uc.org

C/C++ to Perl

Other linking methods

Learning Curve

First, why bind to C?

Case 1

Case 2

Case 3

XS Worked

Getting Started

This creates:

This will create a My-Foo directory that has the following files:


./Changes 
./MANIFEST 
./README
./Foo.xs
./lib/My/Foo.pm
./Makefile.PL
./ppport.h (for newer perls)
./t/My-Foo.t
You can edit Foo.xs, Foo.pm, and probably t/My-Foo.t, and be done.

Foo.xs

Let's start in the only unfamilar one, Foo.xs.

Foo.xs

Let's go


[ My-Foo ]=> perl Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for My::Foo
[ My-Foo ]=> make
cp lib/My/Foo.pm blib/lib/My/Foo.pm
/usr/bin/perl.exe /usr/lib/perl5/5.8/ExtUtils/xsubpp  -typemap
/usr/lib/perl5/5.8/ExtUtils/typemap  Foo.xs > Foo.xsc && mv Foo.xsc Foo.c
## compilation noise omitted ##
[ My-Foo ]=> make test
/usr/bin/perl.exe "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib',
'blib/arch')" t/*.t
t/My-Foo....ok
All tests successful.
Files=1, Tests=1,  0 wallclock secs ( 0.12 cusr +  0.07 csys =  0.20 CPU)

Eeeeasy

Harder - opening files


#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include "ppport.h"

MODULE = My::Foo        PACKAGE = My::Foo

Actually doing something


#includes as before
#include 

int char_sum( char *str );

int
char_sum( char *str )
{
   int sum = 0;
   int len = strlen( str );
   int i;
   for ( i = 0; i < len; i++ )
      sum += str[i];

   return sum;
}

XSub Mapping

And the mapping for Perl:


MODULE = My::Foo        PACKAGE = My::Foo

int
char_sum( str );
    char * str;

First real test:

XS provides typemaps for int and char *. I can now compile and run this. And test it, even:


use List::Util qw{ sum };
{
    my $str  = "foo";
    my $psum = sum unpack 'C*', $str;
    my $xsum = My::Foo::char_sum( $str );
    ok( $psum == $xsum, "Matches!  Woo!" );
}

And make test lets me know that this passes.

So far...

This code is:

Hello world!


#include 

void
hello( void )
{
    printf( "Hello world!\n" );
}

MODULE = My::Foo        PACKAGE = My::Foo

void
hello( );

[ My-Foo ]=> perl -Mblib -MMy::Foo -e 'My::Foo::hello()'
Hello world!

Golf XS


MODULE = My::Foo        PACKAGE = My::Foo

void
hello( );
    CODE:
         printf( "Hello world!\n" );

Return from CODE


int 
sum_two( one, two );
   int one;
   int two;
   CODE:
      RETVAL = one + two;
   OUTPUT:
      RETVAL;

Behind the XS


MODULE = My::Foo        PACKAGE = My::Foo

void
hello( );
    CODE:
       printf( "Hello world!\n" );

xsubpp makes Foo.c


#line 18 "Foo.c"
XS(XS_My__Foo_hello); /* prototype to pass -Wmissing-prototypes */
XS(XS_My__Foo_hello)
{
    dXSARGS;
    if (items != 0)
    Perl_croak(aTHX_ "Usage: My::Foo::hello()");
    {
#line 13 "Foo.xs"
        printf( "Hello world!\n" );
#line 28 "Foo.c"
    }
    XSRETURN_EMPTY;
}

boot strapper


XS(boot_My__Foo); /* prototype to pass -Wmissing-prototypes */
XS(boot_My__Foo)
{
    dXSARGS;
    char* file = __FILE__;

    XS_VERSION_BOOTCHECK ;

    newXS("My::Foo::hello", XS_My__Foo_hello, file);
    XSRETURN_YES;
}

Boooring.

Really, nothing terribly exciting is happening here... in fact it's all pretty tedious. Which is one of two reasons xsubpp exists, to handle the tediousness of it all.

That, and not forcing people to learn and remember all the ins and outs of the Perl calling conventions.

Last .c example


XS(XS_My__Foo_char_sum)
{
    dXSARGS;
    if (items != 1)
    Perl_croak(aTHX_ "Usage: My::Foo::char_sum(str)");
    {
    char *  str = (char *)SvPV_nolen(ST(0));
    int RETVAL;
    dXSTARG;

    RETVAL = char_sum(str);
    XSprePUSH; PUSHi((IV)RETVAL);
    }
    XSRETURN(1);
}

Summary xsubpp

CODE versus Indirection

Can now start doing things

Datatypes

There are other datatypes that are interesting, but not that common, things like globs (GV) and code values (CV).

Scalars

Scalars can be:

Docs: pretty good

Using Perl Datatypes


IV
sum_of_squares( AV *in )
{
    IV last = av_len( in );
    IV sum = 0, i;

    for ( i = 0; i <= last; i++ )
    {
        SV **cur = av_fetch( in, i, 0 );
        sum += pow( SvIV( *cur ), 2 );
    }

    return sum;
}

Mapping with Perl Datatypes


IV
sum_of_squares( in );
   AV *in;

Coercion to references

And we can test it out:


[ My-Foo ]=> perl -Mblib -MMy::Foo -le 
    'print My::Foo::sum_of_squares([1, 2, 3])'
14

Note that this takes an arrayref

Cheating with Prototypes

Actually, we can sort of cheat a bit, with this mapping:


IV
sum_of_squares( in );
   AV *in;
   PROTOTYPE: \@

my @foo = ( 1, 2, 3 );
sum_of_squares( @foo );    # yes
sum_of_squares( 1, 2, 3 ); # no
&sum_of_squares( @foo );   # no 

Case study: Photo Mosiac

Imagine you have something like the Photomosaic code that Daniel and I wrote some time back. A very quick summary:

C Wins Here

You need a large candidate pool. 40k+

Now, here's an area where C will outstrip Perl in two ways:

Compactness of Storage


SV -> RV -> AV -> [ SV -> IVX, SV -> IVX, ... ]

Example dump...


> print Dump $foo = [ 1, 2 ];
SV = RV(0x100381c8) at 0x10010fa0
  REFCNT = 1
  FLAGS = (ROK)
  RV = 0x10010eb0
  SV = PVAV(0x10015f00) at 0x10010eb0
    REFCNT = 3
    FLAGS = ()
    IV = 0
    NV = 0
    ARRAY = 0x10025ba8
    FILL = 2
    MAX = 2
    ARYLEN = 0x0
    FLAGS = (REAL)
    Elt No. 0
    SV = IV(0x10029ac0) at 0x10011030
      REFCNT = 1
      FLAGS = (IOK,pIOK)
      IV = 1
    Elt No. 1
    SV = IV(0x10029ac8) at 0x1002ef24
      REFCNT = 1
      FLAGS = (IOK,pIOK)
      IV = 2

Compact custom types

Let's make our own vector type


typedef struct nvector_s {
    UV dims;
    U8 *data; // int[dims]
} nvector_t;

Typemaps

Create a typemap file and add this:


TYPEMAP
nvector_t *     T_PTRREF

Create vector type


nvector_t *
nvector_new( UV dims )
{
    nvector_t *vect;
    U8        *data;

    vect = (nvector_t *) calloc( 1, sizeof( nvector_t ));
    data = (U8 *) calloc( dims, sizeof( U8 ));

    vect->data = data;
    vect->dims = dims;

    return vect;
}

Destroy vector


void
nvector_free( nvector_t *vect )
{
    free( (void *) vect->data );
    free( (void *) vect );
}

Generate from AV


nvector_t *
nvector_fromAV( AV *in )
{
    I32 dim, i;
    nvector_t *vret;

    dim = av_len( in );
    
    if ( dim <= 0 )
        croak( "Vector dimensionality must be at least 1" );
    
    vret = nvector_new( dim + 1 );
    for ( i = 0; i <= dim; i++ )
    {
        SV **posSV = av_fetch( in, i, 0 );
        if ( ! SvIOK( *posSV ) )
            croak( "Array value at %u", i );

        vret->data[i] = SvIVX( *posSV );
    }
    
    return vret;
}

Do something with it...


long
nvector_sqdist( nvector_t *vecta, nvector_t *vectb )
{
    long sum = 0;
    int i;

    if ( vecta->dims != vectb->dims )
        croak( "Vector length mismatch: %u - %u", 
                vecta->dims, vectb->dims );

    for ( i = 0; i < vecta->dims; i++ )
    {
        int dist = abs( vecta->data[i] - vectb->data[i] );
        sum += pow( dist, 2 );
    }

    return sum;
}

Testing vector distance

Let's test it:


my $vecta = nvector_fromAV( [ 5, 2, 8 ] );
my $vectb = nvector_fromAV( [ 1, 3, 1 ] );
my $dist = nvector_sqdist( $vecta, $vectb );
print "$dist\n";

prints 66

Left holding the memories

Note that we have to manually destroy these vectors we build, or we are leaking memory. There are a bunch of strategies you can employ here:

Cheating by secreting

Just quickly, how to get Perl to garbage collect your C variable:


typedef struct foo_s {
   IV foo; UV bar; NV baz;
} foo_t;

"Encrapsulation"


SV *
make_foo( IV foo, UV bar, NV baz )
{
   SV *ret = newSVpv( "", sizeof( foo_t ));
   foo_t *fobj = (foo_t *) SvPVX( ret );
   fobj->foo = foo;
   fobj->bar = bar;
   fobj->baz = baz;
   return ret;
}

Accessing buried nuts


void
show_foo( SV *val )
{
   foo_t *fobj = (foo_t *) SvPVX( val );
   printf( "foo = %d; bar = %u; baz = %f\n",
      fobj->foo, fobj->bar, fobj->baz );
}

Testing hidden nuts

run:


$ perl -Mblib -MFoo -e 'my $foo = make_foo( 1, 2, 3 ); show_foo( $foo );'
foo = 1; bar = 2; baz = 3.000000

And if we Peek our $foo's PV:


PV = 0xe4860 "\1\0\0\0\2\0\0\0\0\0\0\0\0\0\10@\0"\0

Caveats...

What we can't do (yet)

Thanks

(questions, etc.)