PS3 Programming Log

My working log for learning programming for the PlayStation 3. Inspired by Newcastle University's Computer Game Engineering content, which has a few sections on PS3 programming. Unfortunately, they seem to use the official Sony PS3 devkit, which of course we don't have access to. There is however an open source PS3 SDK we can use: https://github.com/ps3dev. So far I've followed the first introduction section from the Newcastle content and it seems like a good resource; the only drawback is that it's Windows and Visual Studio oriented. (Another future project TODO - convert their example code to SDL/GLEW for Linux support!)

For PS3 my current goal is to get MD5 skeletal animation working from the Newcastle tutorial, and possibly the glTF skeletal animation from Hands-On C++ Game Animation Programming. Also, it'd be cool to get RSXGL working, which is an OpenGL 3.1 implementation, but I haven't had any luck so far.

The source code is currently here: https://bitbucket.org/williamblair/ps3devprogs

05/15/2021

Looking through the psl1ght graphics example. Seems actually pretty similar to OpenGL: we have shaders, buffer attributes, texture formats, etc...

The shaders are NOT compiled and loaded at runtime like in OpenGL; instead they are pre-compiled at the same time as the main program and stored as byte arrays and included as C headers.

The shader language seems pretty similar to GLSL, except programs take in a bunch of function arguments instead of global vars:

void main
(
    float3 vertexPosition : POSITION,
    float3 vertexNormal : NORMAL,
    float2 vertexTexcoord : TEXCOORD0,
    
    uniform float4x4 projMatrix,
    uniform float4x4 modelViewMatrix,
    
    out float4 ePosition : POSITION,
    out float4 oPosition : TEXCOORD0,
    out float3 oNormal : TEXCOORD1,
    out float2 oTexcoord : TEXCOORD2
)
{
    ePosition = mul(mul(projMatrix,modelViewMatrix),float4(vertexPosition,1.0f));
    
    oPosition = float4(vertexPosition,1.0f);
    oNormal = vertexNormal;
    oTexcoord = vertexTexcoord;
}

Vertex buffer data seems to be stored on GPU memory but accessible from the CPU. The memory is allocated via rsxMemalign() and then vertex data is copied into it. There is an 'offset' associated with this allocation used as arguments to other functions; I'm assuming this is like the address of the memory block within GPU memory.

m_meshBuffer->vertices = (S3DVertex*)rsxMemalign(128,m_meshBuffer->cnt_vertices*sizeof(S3DVertex));
...
m_meshBuffer->vertices[i] = S3DVertex(pos.getX(),pos.getY(),pos.getZ(),normal.getX(),normal.getY(),normal.getZ(),tu,static_cast<f32>(ay*RECIPROCAL_PI));
u32 offset = 0;    
rsxAddressToOffset( &m_meshBuffer->vertices[0].pos, &offset );

The S3DVertex and SBuffer classes are included with the psl1ght sample as custom objects (mesh.h), so we don't have to use those if we don't want to.

As a side note, the main CPU core is the PowerPC Processor Unit (PPU), and there are also SPUs (Synergistic Processor Units) that can execute separate programs in parallel if desired. The Cell design has eight of them, of which the PS3 enables seven and reserves one for the OS. RSX stands for Reality Synthesizer and is the PS3 GPU.

The compiled program results in a .elf file (similar to the PS2), but for the PS3 this needs to be converted into a .self file, which I think is an encrypted version of the .elf. Using ps3load with the .elf file didn't seem to work, but using the .self file did. Conversion to a .self is done automatically by the psl1ght sample makefile.

I reworked the sample code to remove the lighting (to simplify things), then made the following C++ classes:

Shader
Texture
Entity
- Sphere
- Torus
- Cube

Rendering an entity then works the same as in OpenGL, where you set shader and vertex buffer attributes, then draw indexed arrays:

g_Shader.Use();
g_Shader.SetVertexProgParam( "projMatrix", (float*)&P );
g_Shader.SetVertexProgParam( "modelViewMatrix", (float*)&modelViewMatrix );
...
g_Sphere.Render();

TODO: make a rendering class, and have a single shader SetAttribute function for both vertex shader and fragment shader attribs. I also made a Logging namespace and macro which logs both to TTY/stdout and to the screen using the debug font sample:

#define DBG_LOG(format, ...)                                            \
    do {                                                                \
        sprintf( Log::dbgPrintStr, "%s:%d: ", __FILE__, __LINE__ );     \
        Log::len = strlen( Log::dbgPrintStr );                          \
        sprintf( &Log::dbgPrintStr[ Log::len ], format, ##__VA_ARGS__ );\
                                                                        \
        /* print via "%s" so a '%' in the message is not re-parsed */   \
        printf( "%s", Log::dbgPrintStr );                               \
                                                                        \
        Log::dbgLogHistory.push_back( Log::dbgPrintStr );               \
        if ( Log::dbgLogHistory.size() > LOG_HISTORY_LEN ) {            \
            Log::dbgLogHistory.pop_front();                             \
        }                                                               \
    } while ( 0 ) /* makes the macro behave as a single statement */

I originally tried an inline function, but __FILE__ and __LINE__ are expanded by the preprocessor at the place they appear in the source, so every message reported the file and line inside the logging function rather than the desired call location. A macro is expanded at each call site, so I used a macro instead. For the DBG_LOG macro, an important note is that you need the ## in front of __VA_ARGS__ to swallow the trailing comma when no extra args are passed; without it the expansion fails to compile. This is a GNU extension (gcc and clang both support it).
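To convince myself of the difference, here is a small standalone sketch (not part of the PS3 code; LineSeenByFunction, LINE_SEEN_BY_MACRO and MacroReportsCallSite are names made up for the illustration). It compares the line number reported from inside a function with the one reported by a macro used at the call site:

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: __LINE__ here always expands to this function's own
// line, no matter where the function is called from or whether it is inlined.
inline int LineSeenByFunction()
{
    return __LINE__;
}

// A macro is expanded at the call site, so __LINE__ is the caller's line.
#define LINE_SEEN_BY_MACRO() (__LINE__)

// Returns true if the two approaches report different lines for one call site.
inline bool MacroReportsCallSite()
{
    const int fromFunction = LineSeenByFunction();
    const int fromMacro = LINE_SEEN_BY_MACRO();
    std::printf( "function: %d, macro: %d\n", fromFunction, fromMacro );
    return fromMacro != fromFunction;
}
```

Calling MacroReportsCallSite() prints two different line numbers: the function version always reports the line of its own return statement, while the macro version reports the line where it was used.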

The result of this rework can be seen running in the rpcs3 emulator, though I've confirmed it runs on hardware too:

05/22/2021

Added pad class for the controller. Basic psl1ght API usage for the pad is as follows:

ioPadInit(7); // Idk what 7 means, TODO - look this up
...
padInfo info;
padData data;
int padnumber = 0;
while (true)
{
    ioPadGetInfo( &info );
    if ( info.status[ padnumber ] )
    {
        ioPadGetData( padnumber, &data );
        bool crossPressed = (bool)data.BTN_CROSS;
    }
}

The list of available info from padData, based on the generated doxygen from the ps3toolchain build is below (I could probably upload the generated doxygen here, it'd be easier than opening the html locally...):

typedef struct _pad_data
 {
     s32 len;                                
     union{
         u16 button[MAX_PAD_CODES];          
         struct {
             u16 zeroes;                     
             unsigned int : 8;               
             unsigned int seven : 4;         
             unsigned int halflen : 4;       
             unsigned int : 8;               
             /* Button information */
             /* 0: UP, 1: DOWN */
             unsigned int BTN_LEFT : 1;      
             unsigned int BTN_DOWN : 1;      
             unsigned int BTN_RIGHT : 1;     
             unsigned int BTN_UP : 1;        
             unsigned int BTN_START : 1;     
             unsigned int BTN_R3 : 1;        
             unsigned int BTN_L3 : 1;        
             unsigned int BTN_SELECT : 1;    
             unsigned int : 8;               
             unsigned int BTN_SQUARE : 1;    
             unsigned int BTN_CROSS : 1;     
             unsigned int BTN_CIRCLE : 1;    
             unsigned int BTN_TRIANGLE : 1;  
             unsigned int BTN_R1 : 1;        
             unsigned int BTN_L1 : 1;        
             unsigned int BTN_R2 : 1;        
             unsigned int BTN_L2 : 1;        
             /* Analog nub information */
             /* 0x0000 - 0x00FF */
             unsigned int ANA_R_H : 16;
             unsigned int ANA_R_V : 16;
             unsigned int ANA_L_H : 16;
             unsigned int ANA_L_V : 16;
 
             /* Pressure-sensitive information */
             /* 0x0000 - 0x00FF */
             unsigned int PRE_RIGHT : 16;
             unsigned int PRE_LEFT : 16;
             unsigned int PRE_UP : 16;
             unsigned int PRE_DOWN : 16;
             unsigned int PRE_TRIANGLE : 16;
             unsigned int PRE_CIRCLE : 16;
             unsigned int PRE_CROSS : 16;
             unsigned int PRE_SQUARE : 16;
             unsigned int PRE_L1 : 16;
             unsigned int PRE_R1 : 16;
             unsigned int PRE_L2 : 16;
             unsigned int PRE_R2 : 16;
 
             /* Sensor information */
             /* 0x0000 - 0x03FF */
             unsigned int SENSOR_X : 16;
             unsigned int SENSOR_Y : 16;
             unsigned int SENSOR_Z : 16;
             unsigned int SENSOR_G : 16;
 
             /* BD remote information */
             unsigned int BTN_BDLEN : 16;    
             unsigned int BTN_BDCODE : 16;   
             u8 reserved[76];
         };
     };
 } padData;

Then for my pad class, I made the following test to check button click/press, the analog sticks, and vibration (called "actuators"):

Pad pad;
...
pad.Init( 0 ); // controller 0
...
while (true)
{
    pad.Update();
    if ( pad.IsHeld( Pad::CROSS ) ) {
        goto done;
    }
    
    // large actuator
    static unsigned short algVal;
    algVal = pad.GetLeftAnalogY();
    static unsigned short prevAlgVal = algVal;
    if ( algVal != prevAlgVal )
    {
        if ( algVal > prevAlgVal && algVal - prevAlgVal > 5 )
        {
            pad.SetLargeActuator( algVal );
        }
        else if ( algVal < prevAlgVal && prevAlgVal - algVal > 5 )
        { 
            pad.SetLargeActuator( algVal );
        }
        DBG_LOG( "Alg Y: %u", algVal ); // value comes from GetLeftAnalogY()
    }
    prevAlgVal = algVal;

    // small actuator
    if ( pad.IsClicked( Pad::SQUARE ) )
    {
        static bool actVal = false;
        actVal = !actVal;
        pad.SetSmallActuator( actVal );
    }
    
    ...
}
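The difference between IsHeld() and IsClicked() comes down to comparing this frame's button state with the previous frame's. The following is only a sketch of that idea, not the actual Pad class: ButtonState and the bitmask layout are assumptions made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the button bookkeeping inside a Pad class.
class ButtonState
{
public:
    // Call once per frame with the current button bitmask.
    void Update( uint32_t currentMask )
    {
        m_prev = m_cur;
        m_cur = currentMask;
    }

    // Held: the button is down this frame.
    bool IsHeld( uint32_t button ) const
    {
        return ( m_cur & button ) != 0;
    }

    // Clicked: down this frame but not the previous one (rising edge).
    bool IsClicked( uint32_t button ) const
    {
        return ( m_cur & button ) != 0 && ( m_prev & button ) == 0;
    }

private:
    uint32_t m_prev = 0;
    uint32_t m_cur = 0;
};
```

With this approach a button held down for many frames reports IsClicked() only on the first frame, which is what the small actuator toggle in the test loop relies on.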

Also created a renderer class. It currently relies too heavily on rsxutil.h/cpp, so a TODO is to inspect those internals. Basic usage is as follows:

Renderer renderer;
Entity entity;
renderer.Init();
while (1)
{
    renderer.BeginFrame();
    renderer.RenderEntity( entity );
    renderer.EndFrame();
}

RenderEntity() actually forwards the important drawing to the Entity class; it sets the render clip settings and then just calls Entity::Render():

void Renderer::RenderEntity( Entity& entity )
{
    rsxSetUserClipPlaneControl(context,GCM_USER_CLIP_PLANE_DISABLE,
                                       GCM_USER_CLIP_PLANE_DISABLE,
                                       GCM_USER_CLIP_PLANE_DISABLE,
                                       GCM_USER_CLIP_PLANE_DISABLE,
                                       GCM_USER_CLIP_PLANE_DISABLE,
                                       GCM_USER_CLIP_PLANE_DISABLE);
    entity.Render();
}

To actually draw triangles, the process is quite OpenGL like: you set vertex pointer attribs and then (in this case) draw indexed triangles (code from Entity.cpp):

rsxAddressToOffset( &m_meshBuffer->vertices[0].pos, &offset );
rsxBindVertexArrayAttrib( m_context,
                          GCM_VERTEX_ATTRIB_POS,
                          0,
                          offset,
                          sizeof( S3DVertex ),
                          3,
                          GCM_VERTEX_DATA_TYPE_F32,
                          GCM_LOCATION_RSX );

rsxAddressToOffset( &m_meshBuffer->vertices[0].nrm, &offset );
rsxBindVertexArrayAttrib( m_context,
                          GCM_VERTEX_ATTRIB_NORMAL,
                          0,
                          offset,
                          sizeof( S3DVertex ),
                          3,
                          GCM_VERTEX_DATA_TYPE_F32,
                          GCM_LOCATION_RSX );

rsxAddressToOffset( &m_meshBuffer->vertices[0].u, &offset );
rsxBindVertexArrayAttrib( m_context,
                          GCM_VERTEX_ATTRIB_TEX0,
                          0,
                          offset,
                          sizeof( S3DVertex ),
                          2,
                          GCM_VERTEX_DATA_TYPE_F32,
                          GCM_LOCATION_RSX );

rsxAddressToOffset( &m_meshBuffer->indices[0], &offset );
rsxDrawIndexArray( m_context,
                   GCM_TYPE_TRIANGLES,
                   offset,
                   m_meshBuffer->cnt_indices,
                   GCM_INDEX_TYPE_16B,
                   GCM_LOCATION_RSX );

Next, I tested file I/O. This is a luxury compared to PS2 and PS1 - there is a PS3 OS which provides filesystem access for us! At first, I tried to program a test file location relative to the SELF executable directory, but the test.txt file failed to open. I added a DEFINE for the assets directory instead so the absolute path was used:

Makefile:
  ASSETS_DIR  :=  /dev_hdd0/rsxtest/game/assets
  TARGET      :=  game/rsxtest
  ...
  CFLAGS      =   -Wall -mcpu=cell $(MACHDEP) $(INCLUDE) -DASSETS_DIR=\"$(ASSETS_DIR)\"

Code:
  static void fileTest()
  {
      std::string line;
      std::ifstream testFile( ASSETS_DIR"/test.txt" );
      if ( !testFile.is_open() ) {
          DBG_LOG( "Failed to open assets/test.txt" );
          return;
      }
  
      while ( std::getline( testFile, line ) )
      {
          // pass the line as an argument, not as the format string
          DBG_LOG( "%s", line.c_str() );
      }
  }

To access the PS3 filesystem remotely, we can use the FTP server built into the PS3 CFW. I can access the ftp server from my computer by mounting it in a local directory using curlftpfs; then it's just a matter of copying the files over:

curlftpfs ftp://192.168.0.14 ~/ps3_ftp/ # mount the ftp server
cp -r game ~/ps3_ftp/dev_hdd0/rsxtest/

On the PS3 side, to launch the game, we can find the SELF executable in the multiMAN file explorer and launch it from there (although ps3load should still work, I think...).

05/23/2021

Got an untextured MD2 model working by modifying the MD2 model code from Beginning OpenGL Game Programming.

I first ran into a file endianness issue:

The version field of the MD2 header (a 32-bit integer) was being read as 0x08000000 instead of 0x00000008. I forgot that the PS3 Cell CPU is big endian, whereas my x86_64 desktop is little endian, and the MD2 file fields are stored as little endian as well. I chose to handle this on the PS3 by converting the fields read from the file from little to big endian during load:

Md2Model.h:

// little endian to big endian
// Bytes are accessed as uint8_t (from <cstdint>) so values >= 0x80 are not
// sign-extended on platforms where plain char is signed.
inline int32_t lend2bend32( int32_t lend )
{
    uint8_t* ptr = (uint8_t*)&lend;
    return (int32_t)( ((uint32_t)ptr[3] << 24) |
                      ((uint32_t)ptr[2] << 16) |
                      ((uint32_t)ptr[1] << 8)  |
                      ((uint32_t)ptr[0]) );
}
inline float lend2bend32f( float lend )
{
    float res;
    uint8_t* ptr = (uint8_t*)&lend;
    uint8_t* rPtr = (uint8_t*)&res;
    rPtr[0] = ptr[3];
    rPtr[1] = ptr[2];
    rPtr[2] = ptr[1];
    rPtr[3] = ptr[0];
    return res;
}
inline int16_t lend2bend16( int16_t lend )
{
    uint8_t* ptr = (uint8_t*)&lend;
    return (int16_t)( (ptr[1] << 8) |
                      (ptr[0]) );
}

Md2Model.cpp:
...
READ_DATA( lend2bend32( header.skinOffset ), skins, lend2bend32( header.numSkins ), Skin);
READ_DATA( lend2bend32( header.texCoordOffset ), md2TexCoords, lend2bend32( header.numTexCoords )
READ_DATA( lend2bend32( header.triangleOffset ), triangles, lend2bend32( header.numTriangles ), T

for ( TexCoord& tc : md2TexCoords )
{
    tc.s = lend2bend16( tc.s );
    tc.t = lend2bend16( tc.t );
}   
for ( Triangle& tr : triangles )
{
    tr.vertIndex[0] = lend2bend16( tr.vertIndex[0] );
    tr.vertIndex[1] = lend2bend16( tr.vertIndex[1] );
    tr.vertIndex[2] = lend2bend16( tr.vertIndex[2] );
    tr.texCoordIndex[0] = lend2bend16( tr.texCoordIndex[0] );
    tr.texCoordIndex[1] = lend2bend16( tr.texCoordIndex[1] );
    tr.texCoordIndex[2] = lend2bend16( tr.texCoordIndex[2] );
}  
...
// and similar for other locations with int16_t/int32_t/float
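An alternative I could have used, sketched here rather than taken from the project (ReadLE16 and ReadLE32 are hypothetical helper names), is to assemble each field from the raw file bytes with shifts. Because shifts work on values rather than memory layout, the same code gives the right answer on both the big endian PS3 and a little endian PC, so no per-platform swap is needed:

```cpp
#include <cassert>
#include <cstdint>

// Assemble a little-endian 16-bit field from two raw file bytes.
inline uint16_t ReadLE16( const uint8_t* p )
{
    return static_cast<uint16_t>( p[0] | ( p[1] << 8 ) );
}

// Assemble a little-endian 32-bit field from four raw file bytes.
// The result is the same regardless of the host CPU's byte order.
inline uint32_t ReadLE32( const uint8_t* p )
{
    return  static_cast<uint32_t>( p[0] )         |
           ( static_cast<uint32_t>( p[1] ) << 8 )  |
           ( static_cast<uint32_t>( p[2] ) << 16 ) |
           ( static_cast<uint32_t>( p[3] ) << 24 );
}
```

For the MD2 version field, the file bytes 08 00 00 00 come out as 8 on either platform. The trade-off is that the loader has to read into a byte buffer and pull fields out one at a time instead of reading whole structs.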

I also had to swap the Vec3 class with Vector3 (from the psl1ght library), and define a basic Vector2 struct:

struct KeyFrame
{
    f32 scale[3];     // use to multiply and add Vertex::v
    f32 translate[3];
    char name[16];
    std::vector<Vertex> md2Vertices;
    std::vector<Vector3> vertices; // converted result vertices
};
...
struct Vector2
{
    f32 u,v;
};

Then I replaced the OpenGL buffering with RSX memory buffering like in the torus/sphere classes:


genBuffers:

m_meshBuffer->cnt_vertices = interpolatedFrame.vertices.size();
m_meshBuffer->vertices =
    (S3DVertex*)rsxMemalign( 128, 
                             m_meshBuffer->cnt_vertices * sizeof(S3DVertex) );
if ( !m_meshBuffer->vertices ) {
    DBG_LOG( "Failed to alloc vertices\n" );
    return false; 
}   
m_meshBuffer->indices = nullptr;
if ( texCoords.size() != interpolatedFrame.vertices.size() ) {
    DBG_LOG( "Tex coords size != vertices size (%lu, %lu)\n",
            texCoords.size(), interpolatedFrame.vertices.size() );
    return false;
}
for ( u32 i = 0; i < m_meshBuffer->cnt_vertices; ++i )
{
    Vector3& pos = interpolatedFrame.vertices[i];
    m_meshBuffer->vertices[i] = S3DVertex( pos.getX(), pos.getY(), pos.getZ(),
                                           0.0f, 1.0f, 0.0f, // TODO - normals
                                           texCoords[i].u, texCoords[i].v );
}
return true;

Update:

...
// Update RSX memory
for ( u32 i = 0; i < m_meshBuffer->cnt_vertices; ++i )
{
    Vector3& pos = interpolatedFrame.vertices[i];
    m_meshBuffer->vertices[i] = S3DVertex( pos.getX(), pos.getY(), pos.getZ(),
                                           0.0f, 1.0f, 0.0f, // TODO - normals
                                           texCoords[i].u, texCoords[i].v );
}

The md2 model also needs to be drawn with triangle arrays instead of indices. I found rsxDrawVertexArray() in the psl1ght doxygen on the same page as rsxDrawIndexArray() used previously, on the commands.h file reference:

Then a slight modification to Entity::Render() to check whether to draw with indices or arrays:

...
if ( m_meshBuffer->indices != nullptr )
{
    rsxAddressToOffset( &m_meshBuffer->indices[0], &offset );
    rsxDrawIndexArray( m_context,
                       GCM_TYPE_TRIANGLES,        // u32 type
                       offset,                    // u32 offset
                       m_meshBuffer->cnt_indices, // u32 count
                       GCM_INDEX_TYPE_16B,        // u32 data_type
                       GCM_LOCATION_RSX );        // u32 location
}
else
{
    rsxDrawVertexArray( m_context,
                        GCM_TYPE_TRIANGLES,
                        0,
                        m_meshBuffer->cnt_vertices );
}
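The reason non-indexed drawing fits MD2 is that each triangle corner carries two separate indices, one into the positions and one into the texture coordinates, so a single index buffer can't address both. Below is a rough sketch of the expansion step on plain std::vector data; Vec3, Vec2, FlatVertex and ExpandTriangles are stand-in names for this example, not the project's types:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
struct Vec2 { float u, v; };
struct FlatVertex { Vec3 pos; Vec2 uv; };

// Mirrors the MD2 triangle layout used above: separate position and
// texture-coordinate indices for each of the three corners.
struct Md2Triangle
{
    uint16_t vertIndex[3];
    uint16_t texCoordIndex[3];
};

// Expand indexed triangles into one vertex per corner (3 per triangle),
// ready to be drawn as a plain triangle list without an index buffer.
std::vector<FlatVertex> ExpandTriangles( const std::vector<Vec3>& positions,
                                         const std::vector<Vec2>& texCoords,
                                         const std::vector<Md2Triangle>& tris )
{
    std::vector<FlatVertex> out;
    out.reserve( tris.size() * 3 );
    for ( const Md2Triangle& tr : tris )
    {
        for ( int corner = 0; corner < 3; ++corner )
        {
            // at() throws on a bad index instead of reading out of bounds
            out.push_back( { positions.at( tr.vertIndex[corner] ),
                             texCoords.at( tr.texCoordIndex[corner] ) } );
        }
    }
    return out;
}
```

The output has exactly three vertices per triangle, which also explains why the texture coordinate count is checked against the vertex count in genBuffers.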

Finally, in order to update the md2's animation, I added a GameTimer class in main.cpp, based off ppu/include/sys/systime.h sysGetCurrentTime()

class GameTimer
{
public:
    GameTimer()
    {   
        GetCurSecNsec( &m_lastSec, &m_lastNsec );
        m_FPS = 0.0f;
    }   
    ~GameTimer()
    {}  
    void Update()
    {   
        u64 curSec;
        u64 curNsec;
        GetCurSecNsec( &curSec, &curNsec );

        u64 secDiff = curSec - m_lastSec;
        u64 nsecDiff = (curNsec < m_lastNsec) ? m_lastNsec - curNsec : curNsec - m_lastNsec;

        m_deltaMs = (float( secDiff ) * 1000.0f) +
                    (float( nsecDiff ) / 1000000.0f);
        m_lastSec = curSec;
        m_lastNsec = curNsec;

        ++m_framesCount;
        m_fpsTimeCount += m_deltaMs;
        if ( m_fpsTimeCount / 1000.0f >= m_fpsDuration )
        {
            m_FPS = m_framesCount / ( m_fpsTimeCount / 1000.0f ); // frames / elapsed seconds
            m_framesCount = 0;
            m_fpsTimeCount = 0.0f;
        }
    }
    float GetDeltaMS() { return m_deltaMs; }
    float GetFPS() { return m_FPS; }
private:
    float m_deltaMs; // time passed since last update
    float m_FPS;
    u64 m_lastSec;
    u64 m_lastNsec;
    u32 m_framesCount = 0;
    float m_fpsTimeCount = 0.0f;
    const float m_fpsDuration = 1.0f; // seconds until update fps

    inline void GetCurSecNsec( u64* sec, u64* nsec )
    {
        sysGetCurrentTime( sec, nsec );
    }
};

The code in main to then load and draw the md2 model is:

static Md2Model g_Ogro;
...
    if ( !g_Ogro.Init( g_Renderer.GetGcmContext() ) ) {
        DBG_LOG( "Failed to init md2 model ogro\n" );
        return;
    }
    if ( !g_Ogro.Load( ASSETS_DIR"/models/Ogro/tris.md2" ) ) {
        DBG_LOG( "Failed to load assets/models/Ogro/tris.md2\n" );
        return;
    } else {
        DBG_LOG( "Success loading assets/models/Ogro/tris.md2\n" );
    }
    g_Ogro.SetAnimation( Md2Model::Animation::Idle );
...
g_Timer.Update();
g_Ogro.Update( g_Timer.GetDeltaMS() );
...
modelMatrix = Matrix4::scale( Vector3( 0.25f, 0.25f, 0.25f ) );
modelMatrix.setTranslation(Vector3(3.0f,0.0f,-8.0f));

modelMatrixIT = inverse(modelMatrix);
modelViewMatrix = transpose(viewMatrix*modelMatrix);

objEyePos = modelMatrixIT*eye_pos;

g_Shader.Use();
g_Shader.SetVertexProgParam( "projMatrix", (float*)&g_ProjMatrix );
g_Shader.SetVertexProgParam( "modelViewMatrix", (float*)&modelViewMatrix );

g_Renderer.RenderEntity( g_Ogro );

Which gives us an improperly textured model:

As a side note, I found you can add files to the rpcs3 emulator by simply placing them in the appropriate directory in the same folder as the emulator executable:

Next was texture loading. Like the Md2Model class, I copied and modified the Targa image loading implementation from Beginning OpenGL Game Programming. Like the .md2 model file, the .tga texture's multi-byte fields need to be converted from little to big endian. Luckily, there are only 4 fields in the file header that are larger than 1 byte and hence need to be flipped (xOrigin, yOrigin, width, and height):

struct Header
{
    uint8_t idLength;
    uint8_t colorMapType;
    uint8_t imageTypeCode;
    uint8_t colorMapSpec[5];
    uint16_t xOrigin;
    uint16_t yOrigin;
    uint16_t width;
    uint16_t height;
    uint8_t bpp;        // bits per pixel
    uint8_t imageDesc;
};

Other than that I didn't really need to modify the Targa code at all, just remove the OpenGL buffer function. To load the texture data for the PS3 GPU, I made the Targa::Image class a child of the Texture class, and added the Texture::Init function to Targa::Image::Load:


namespace Targa
{
...
class Image : public Texture
{
// internally calls Texture::Init()
bool Load( gcmContextData* context, const std::string& fileName );
...
bool Image::Load( gcmContextData* context, const std::string& fileName )
{
...
    return Texture::Init( context,
                          imageData.data(),
                          width,
                          height,
                          bytesPerPixel );
}

Within the Texture class, I had to modify the init function to account for textures with 3 bytes per pixel (no alpha); these are given a default alpha value of 255:


bool Texture::Init( gcmContextData* context, u8* pixelData, u32 width, u32 height, u32 bytesPerPixel )
{
...
    m_buffer = (u32*)rsxMemalign( 128, width * height * 4 ); // force 4 bytes per pixel
...
    DBG_ASSERT( bytesPerPixel == 4 || bytesPerPixel == 3 );
    u8* buffer = (u8*)m_buffer;
    for ( u32 i = 0; i < width * height * 4; i += 4 )
    {
        buffer[ i + 1 ] = *pixelData++; // r
        buffer[ i + 2 ] = *pixelData++; // g
        buffer[ i + 3 ] = *pixelData++; // b
        if ( bytesPerPixel == 4 ) {
            buffer[ i + 0 ] = *pixelData++; // a
        } else {
            buffer[ i + 0 ] = 255;
        }
    }
...
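The same conversion can be tried off-console. This is a host-side sketch of the loop above, assuming tightly packed source pixels and the same A,R,G,B destination byte order; ExpandToArgb is a name made up for the example and writes into a std::vector instead of RSX memory:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand tightly packed 3- or 4-byte pixels into 4-byte A,R,G,B order,
// using an opaque alpha (255) when the source has no alpha channel.
std::vector<uint8_t> ExpandToArgb( const std::vector<uint8_t>& src,
                                   std::size_t bytesPerPixel )
{
    assert( bytesPerPixel == 3 || bytesPerPixel == 4 );
    assert( src.size() % bytesPerPixel == 0 );

    const std::size_t pixelCount = src.size() / bytesPerPixel;
    std::vector<uint8_t> dst( pixelCount * 4 );
    for ( std::size_t p = 0; p < pixelCount; ++p )
    {
        const uint8_t* in = &src[ p * bytesPerPixel ];
        uint8_t* out = &dst[ p * 4 ];
        out[1] = in[0]; // r
        out[2] = in[1]; // g
        out[3] = in[2]; // b
        out[0] = ( bytesPerPixel == 4 ) ? in[3] : 255; // a
    }
    return dst;
}
```

Indexing the source by pixel rather than advancing a shared pointer makes it easier to see that a 3-byte source never reads a fourth byte.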

Initially the texture wasn't displaying right on the model; it turned out to be a bug where I was converting the Md2 texture coordinates from little to big endian twice; so I was ending up at little endian again. Fixing that we now see a glorious Ogro model:

06/05/2021

Working on getting MD5 Skeletal Animation from Newcastle tutorials on PS3. To start, I replaced the Newcastle Vector3, Vector4, Matrix3, and Matrix4 classes with the PS3 SDK SIMD optimized classes. Additionally, the OpenGL code had to be replaced with the PS3 RSX memory/vertex code. This included adding support for 32-bit indices instead of 16-bit indices, due to the number of vertices required.

DBG_LOG( "Generating RSX buffers\n" );

if ( vertices == nullptr ) {
    DBG_LOG( "vertices null!\n" );
    return false;
}
m_meshBuffer->cnt_vertices = numVertices;
m_meshBuffer->vertices =
    (S3DVertex*)rsxMemalign( 128,
                             m_meshBuffer->cnt_vertices * sizeof(S3DVertex) );
if ( !m_meshBuffer->vertices ) {
    DBG_LOG( "Failed to alloc vertices\n" );
    return false;
}

if ( indices )
{
    m_meshBuffer->cnt_indices = numIndices;
    m_meshBuffer->indices = nullptr;
    m_meshBuffer->indices32 =
        (u32*)rsxMemalign( 128, m_meshBuffer->cnt_indices * sizeof(u32) );
    if ( !m_meshBuffer->indices32 ) {
        DBG_LOG( "Failed to alloc indices32\n" );
        rsxFree( m_meshBuffer->vertices ); // RSX allocations don't come from the C heap
        return false;
    }
}

for ( u32 i = 0; i < m_meshBuffer->cnt_vertices; ++i )
{
    Vectormath::Aos::Vector3& pos = vertices[i];
    Vector2& texCoord = textureCoords[i];
    m_meshBuffer->vertices[i] = S3DVertex( pos.getX(), pos.getY(), pos.getZ(),
                                           0.0f, 1.0f, 0.0f, // TODO - normals
                                           texCoord.u, 1.0f - texCoord.v );
}

DBG_LOG( "mesh buffer num indices: %u\n", m_meshBuffer->cnt_indices );
for ( u32 i = 0; i < m_meshBuffer->cnt_indices; ++i )
{
    m_meshBuffer->indices32[i] = indices[i];
}
...
if ( m_meshBuffer->indices != nullptr )
{
    rsxAddressToOffset( &m_meshBuffer->indices[0], &offset );
    rsxDrawIndexArray( m_context,
                       GCM_TYPE_TRIANGLES,        // u32 type
                       offset,                    // u32 offset
                       m_meshBuffer->cnt_indices, // u32 count
                       GCM_INDEX_TYPE_16B,        // u32 data_type
                       GCM_LOCATION_RSX );        // u32 location
}
else if ( m_meshBuffer->indices32 != nullptr )
{
    rsxAddressToOffset( &m_meshBuffer->indices32[0], &offset );
    rsxDrawIndexArray( m_context,
                       GCM_TYPE_TRIANGLES,        // u32 type
                       offset,                    // u32 offset
                       m_meshBuffer->cnt_indices, // u32 count
                       GCM_INDEX_TYPE_32B,        // u32 data_type
                       GCM_LOCATION_RSX );        // u32 location
}

After those changes, I ended up with an initial bug:

After debugging by logging values to text files and comparing against the PC version, I narrowed it down to this code:

Vectormath::Aos::Vector4 res = ((joint.transform * weight.position) * weight.weightValue);
target->vertices[j] += Vectormath::Aos::Vector3( res.getX(), res.getY(), res.getZ() );

weight.position was a Vector3. As far as I can tell, the Vectormath Matrix4 * Vector3 operator treats the vector as a direction (w = 0), so the joint's translation was being dropped, whereas the Newcastle matrix code assumes w = 1. Building a Vector4 from weight.position with w = 1 fixed the issue:

Vectormath::Aos::Vector4 res = (( joint.transform *
                                  Vectormath::Aos::Vector4( weight.position, 1.0f ))
                                  * weight.weightValue );
target->vertices[j] += res.getXYZ();
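To see why the w component matters, here is a tiny hand-rolled sketch. It does not use Vectormath; Mat4, Vec4, MakeTranslation and Mul are stand-ins written for this example, and the claim that the library's Vector3 overload behaves like w = 0 is my reading of the bug rather than something checked against the library source.

```cpp
#include <cassert>

struct Vec4 { float x, y, z, w; };

// Minimal row-major 4x4 matrix, standing in for the library type.
struct Mat4
{
    float m[4][4];
};

// Translation-only matrix: the offset lives in the last column.
inline Mat4 MakeTranslation( float tx, float ty, float tz )
{
    return Mat4{ { { 1.0f, 0.0f, 0.0f, tx },
                   { 0.0f, 1.0f, 0.0f, ty },
                   { 0.0f, 0.0f, 1.0f, tz },
                   { 0.0f, 0.0f, 0.0f, 1.0f } } };
}

// Plain matrix * column-vector product.
inline Vec4 Mul( const Mat4& a, const Vec4& v )
{
    const float in[4] = { v.x, v.y, v.z, v.w };
    float out[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    for ( int r = 0; r < 4; ++r )
    {
        for ( int c = 0; c < 4; ++c )
        {
            out[r] += a.m[r][c] * in[c];
        }
    }
    return Vec4{ out[0], out[1], out[2], out[3] };
}
```

Multiplying a translation of (5, 0, 0) by (1, 2, 3, 0) leaves x at 1, while (1, 2, 3, 1) moves x to 6. The last column is only picked up when w is 1, which is why the skinned vertices ignored the joint positions before the fix.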

Another bug:

float oriX, oriY, oriZ;
from >> oriX >> oriY >> oriZ;
baseFrame.orientations[current].setX( oriX ); // bug! should be oriX,oriY,oriZ
baseFrame.orientations[current].setY( oriX );
baseFrame.orientations[current].setZ( oriX );

Better, but the animation is still weird/jerky, and the texture coordinates are still off...

Turned out to be an issue with my game timer class:

//g_MD5Node->Update( g_Timer.GetDeltaMS() );
g_MD5Node->Update( 16.0f );

Hard coding the elapsed milliseconds made the animation nice and smooth, as expected.

Modified the game timer code to combine both seconds and nanoseconds into just nanoseconds:

//u64 secDiff = curSec - m_lastSec;
//u64 nsecDiff = (curNsec < m_lastNsec) ? m_lastNsec - curNsec : curNsec - m_lastNsec;
// integer constant keeps the math in u64 (1e9 is a double)
u64 curNano = (curSec * 1000000000ULL) + curNsec;
u64 lastNano = (m_lastSec * 1000000000ULL) + m_lastNsec;

//m_deltaMs = (float( secDiff ) * 1000.0f) +
//            (float( nsecDiff ) / 1000000.0f);
// divide as float so fractional milliseconds are kept
m_deltaMs = float( curNano - lastNano ) / 1000000.0f;
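To make the failure concrete, the sketch below puts the two calculations side by side as free functions (DeltaMsSeparate and DeltaMsCombined are names invented for this example; the timestamps are made-up values, not measurements). The old version goes wrong whenever the nanosecond field wraps past a second boundary between two frames:

```cpp
#include <cassert>
#include <cstdint>

// Old approach: seconds and nanoseconds are differenced separately and the
// nanosecond difference is forced positive, which over-counts whenever the
// nanosecond field wraps past a second boundary.
inline float DeltaMsSeparate( uint64_t lastSec, uint64_t lastNsec,
                              uint64_t curSec, uint64_t curNsec )
{
    const uint64_t secDiff = curSec - lastSec;
    const uint64_t nsecDiff = ( curNsec < lastNsec ) ? lastNsec - curNsec
                                                     : curNsec - lastNsec;
    return float( secDiff ) * 1000.0f + float( nsecDiff ) / 1000000.0f;
}

// New approach: convert both timestamps to nanoseconds, then subtract once.
inline float DeltaMsCombined( uint64_t lastSec, uint64_t lastNsec,
                              uint64_t curSec, uint64_t curNsec )
{
    const uint64_t kNsPerSec = 1000000000ULL;
    const uint64_t lastNano = lastSec * kNsPerSec + lastNsec;
    const uint64_t curNano = curSec * kNsPerSec + curNsec;
    return float( curNano - lastNano ) / 1000000.0f;
}
```

Take a last frame at 10 s + 990,000,000 ns and a current frame at 11 s + 6,000,000 ns, which are really 16 ms apart. The separate version computes 1000 ms + 984 ms = 1984 ms, while the combined version gives 16 ms. A spike like that roughly once per second would explain the jerky animation.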

Next, to fix the texture, I had to flip the v coordinate, like is done in the Md2 model class:

for ( u32 i = 0; i < m_meshBuffer->cnt_vertices; ++i )                                           
{                                                                                                
    Vectormath::Aos::Vector3& pos = vertices[i];                                                 
    Vector2& texCoord = textureCoords[i];
    m_meshBuffer->vertices[i] = S3DVertex( pos.getX(), pos.getY(), pos.getZ(),                   
                                              0.0f, 1.0f, 0.0f, // TODO - normals                   
                                              texCoord.u, 1.0f - texCoord.v ); // flipped v to 1.0f - v
}

And voilà! The model is fully textured and animated, working on the console:

I loaded the proper texture outside of the MD5 class; TODO would be to auto load the texture based on the file name within the MD5.

06/06/2021

Working on the heightmap terrain from Beginning OpenGL Game Programming. Similar to the MD5 model, I replaced the Vector/Matrix classes with the PS3 Vectormath::Aos::Vector3 versions. I also moved the texturing outside of the class (into main for now).

The scale of the terrain compared to the MD5 model is quite different; I'll have to either scale down the MD5 model or scale up the terrain.

I initially had a texturing issue:

Viewing the zoomed-out version showed that the texture was not repeating over each terrain section. Looking at the texture class code, I saw the texture was configured to clamp instead of wrap, which matches what the zoomed-out terrain shows.

Looking at the PSL1GHT doxygen, we can see the different texture options:

I chose GCM_TEXTURE_REPEAT to use instead of GCM_TEXTURE_CLAMP_TO_EDGE:

//rsxTextureWrapMode(m_context,shaderTexUnit,GCM_TEXTURE_CLAMP_TO_EDGE,GCM_TEXTURE_CLAMP_TO_EDGE,GCM_TEXTURE_CLAMP_TO_EDGE,0,GCM_TEXTURE_ZFUNC_LESS,0);
rsxTextureWrapMode(m_context,shaderTexUnit,GCM_TEXTURE_REPEAT,GCM_TEXTURE_REPEAT,GCM_TEXTURE_REPEAT,0,GCM_TEXTURE_ZFUNC_LESS,0);
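As a mental model of the two modes (assuming they behave like their OpenGL namesakes, and ignoring filtering and texel-center details), this is roughly what happens to a texture coordinate outside the 0..1 range. WrapClamp and WrapRepeat are illustrative functions, not RSX calls:

```cpp
#include <cassert>
#include <cmath>

// Clamp: coordinates outside [0, 1] stick to the edge of the texture.
inline float WrapClamp( float t )
{
    return std::fmin( std::fmax( t, 0.0f ), 1.0f );
}

// Repeat: only the fractional part is used, so the image tiles.
inline float WrapRepeat( float t )
{
    return t - std::floor( t );
}
```

A terrain coordinate of 2.25 samples the texture edge (1.0) under clamp but samples 0.25 under repeat, which is why the clamped terrain showed one copy of the texture with stretched edges everywhere else.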

Scaling the terrain by a factor of 20 we can now see the ground in relation to the MD5 model:

However, when I tried running it on my PS3 it didn't turn out so well:

I guessed it had to do with the massive scaling and size difference between the terrain and MD5 model. After adding a scale option to the MD5 loading code and scaling it down to 1/20th its original size, and then adjusting the camera, running on PS3 hardware worked again:

06/07/2021

Got the transparent texture for the tree object from Beginning OpenGL Game Programming working. I made a static S3DVertex pointer class member so that all Tree instances can share the same vertex memory. I also changed the vertex/texture coordinate buffers to use triangles instead of triangle strips, just because I was lazy and didn't want to figure out how to do strips on the PS3 (I think 4 extra vertices won't kill us!).

// static buffer instance
S3DVertex* Tree::s_vertices = nullptr;
...
static const size_t numVertices = 12;
m_meshBuffer->cnt_vertices = numVertices;
if ( s_vertices == nullptr )
{
    float vertices[numVertices*3] = 
    {
        // first square
        -1.0f, 0.0f, 0.0f, // bottom left
        1.0f,  0.0f, 0.0f, // bottom right
        -1.0f, 2.0f, 0.0f, // top left
        1.0f,  0.0f, 0.0f, // bottom right
        1.0f,  2.0f, 0.0f, // top right
        -1.0f, 2.0f, 0.0f, // top left

        // second square
        0.0f, 0.0f,  1.0f, // bottom left
        0.0f, 0.0f, -1.0f, // bottom right
        0.0f, 2.0f,  1.0f, // top left
        0.0f, 0.0f, -1.0f, // bottom right
        0.0f, 2.0f, -1.0f, // top right
        0.0f, 2.0f,  1.0f // top left
        
    };

    float texCoords[numVertices*2] =
    {
        // first square
        0.0f, 0.0f, // bottom left
        1.0f, 0.0f, // bottom right
        0.0f, 1.0f, // top left
        1.0f, 0.0f, // bottom right
        1.0f, 1.0f, // top right
        0.0f, 1.0f, // top left

        // second square
        0.0f, 0.0f, // bottom left
        1.0f, 0.0f, // bottom right
        0.0f, 1.0f, // top left
        1.0f, 0.0f, // bottom right
        1.0f, 1.0f, // top right
        0.0f, 1.0f, // top left
    };

    s_vertices = (S3DVertex*)rsxMemalign( 128,
                                          m_meshBuffer->cnt_vertices * sizeof(S3DVertex) );
    for ( size_t i = 0; i < numVertices; ++i )
    {
        float* vertex = &vertices[i*3];
        float* texCoord = &texCoords[i*2];
        s_vertices[i] = S3DVertex( vertex[0], vertex[1], vertex[2],
                                    0.0f, 1.0f, 0.0f, // TODO - normals
                                    texCoord[0], texCoord[1] );
    }
}
m_meshBuffer->vertices = s_vertices;
m_meshBuffer->cnt_indices = 0;
m_meshBuffer->indices = nullptr;
m_meshBuffer->indices32 = nullptr;
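
If strip data ever needs converting again, the expansion to a triangle list can be done mechanically instead of by hand. A minimal sketch, assuming plain index arrays (StripToTriangleList is a hypothetical helper, not part of PSL1GHT):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expand triangle-strip indices into a triangle list.
// Every odd triangle in a strip has flipped winding, so its first two
// indices are swapped to keep all triangles facing the same way.
std::vector<std::uint32_t> StripToTriangleList( const std::vector<std::uint32_t>& strip )
{
    std::vector<std::uint32_t> tris;
    if ( strip.size() < 3 ) { return tris; }
    tris.reserve( ( strip.size() - 2 ) * 3 );
    for ( std::size_t i = 0; i + 2 < strip.size(); ++i )
    {
        const bool odd = ( i % 2 ) != 0;
        tris.push_back( odd ? strip[i + 1] : strip[i] );
        tris.push_back( odd ? strip[i] : strip[i + 1] );
        tris.push_back( strip[i + 2] );
    }
    return tris;
}
```

A four-vertex strip ordered bottom left, bottom right, top left, top right expands to two triangles, which are the same two triangles as each hand-written square above.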

Next, I updated the fragment shader to allow alpha values:

//float3 color = tex2D(texture, texcoord).xyz; // old
float4 color = tex2D(texture, texcoord); // new
    
//oColor = float4(color,1.0f); // old
oColor = color; // new

Yet the tree texture was showing black where it should be transparent:

Hard-coding red and green values in the shader confirmed that the shader output itself was working:

After some poking around in the psl1ght doxygen, I found that blending is disabled by default. I added the following calls to Renderer::SetDrawEnv(). I was surprised that rsxSetBlendFunc() needed to be called explicitly as well; I had assumed some default would work.

rsxSetBlendEnable( context, GCM_TRUE ); // enable blending
rsxSetBlendEquation( context, GCM_FUNC_ADD, GCM_FUNC_ADD ); // default blend equation...
rsxSetBlendFunc( context, GCM_SRC_ALPHA, // sfcolor (source)
                          GCM_ONE_MINUS_SRC_ALPHA, // dfcolor (destination)
                          GCM_SRC_ALPHA, // sfalpha
                          GCM_ONE_MINUS_SRC_ALPHA ); // dfalpha

Renderer::SetDrawEnv() is called every Renderer::BeginFrame(). Now we have a transparent texture for the tree as expected:
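
For reference, the blend setup above boils down to one per-channel formula. A minimal sketch, assuming the RSX follows the usual OpenGL blending semantics for these enums (Rgba and BlendSrcAlpha are illustrative names, not PSL1GHT API):

```cpp
#include <array>
#include <cstddef>

using Rgba = std::array<float, 4>; // r, g, b, a in the range 0..1

// With the ADD equation and SRC_ALPHA / ONE_MINUS_SRC_ALPHA factors on both
// color and alpha, every channel becomes:
//   out = src * srcAlpha + dst * (1 - srcAlpha)
Rgba BlendSrcAlpha( const Rgba& src, const Rgba& dst )
{
    const float a = src[3];
    Rgba out{};
    for ( std::size_t i = 0; i < out.size(); ++i )
    {
        out[i] = src[i] * a + dst[i] * ( 1.0f - a );
    }
    return out;
}
```

A texel with alpha 0 leaves the destination untouched, which is why the fully transparent parts of the tree texture stop covering the background once blending is enabled.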

06/21/2021

Added an orbit camera class based on "Game Programming in C++" by Sanjay Madhav, chapter 9.

Surprisingly, it worked basically on the first try, after replacing Madhav's Vector/Quaternion code with the PSL1GHT library code:

void OrbitCamera::Update( const float dt )
{
    // yaw quaternion about world up
    Vectormath::Aos::Quat yaw = Vectormath::Aos::Quat::rotationY(
        m_yawSpeed * dt );
 
    // transform offset and up by yaw
    m_offset = Vectormath::Aos::rotate( yaw, m_offset );
    m_up = Vectormath::Aos::rotate( yaw, m_up );
    
    // Compute camera forward and right
    // forward = owner.position - (owner.position + offset)
    // = -offset
    Vectormath::Aos::Vector3 forward = -1.0f * m_offset;
    forward = normalize( forward );
    Vectormath::Aos::Vector3 right = Vectormath::Aos::cross( m_up, forward );
    right = Vectormath::Aos::normalize( right );
    
    // Create quaternion for pitch about camera right
    Vectormath::Aos::Quat pitch = Vectormath::Aos::Quat::rotation(
        m_pitchSpeed * dt, right );
    // transform offset and up by pitch
    m_offset = Vectormath::Aos::rotate( pitch, m_offset );
    m_up = Vectormath::Aos::rotate( pitch, m_up );
    
    // Create the lookat matrix
    m_viewMat = Vectormath::Aos::Matrix4::lookAt(
        Vectormath::Aos::Point3( m_target + m_offset ), // camera position
        Vectormath::Aos::Point3( m_target ), // camera target/look at
        m_up ); // up vector
}

Then, in main, I replaced the viewMatrix with the Camera GetViewMatrix() function. TODO is also to move the entity positions within the entity class (for the model matrix):

static OrbitCamera g_Camera;
...
modelViewMatrix = transpose(g_Camera.GetViewMatrix()*modelMatrix);
...
g_Camera.SetTarget( g_MD5Position );
g_Camera.Update( g_Timer.GetDeltaMS() );
//g_Camera.SetPitchSpeed( 0.0005f ); // rotate the camera
g_Camera.SetYawSpeed( 0.0005f );

TODO is to move the camera with the controller, but testing shows valid rotation around our MD5 model target:

06/22/2021

Got model rotation/forward direction working. Added yaw, pitch, position, forward speed, and a model matrix to the Entity class:

Vectormath::Aos::Matrix4 m_modelMat;
Vectormath::Aos::Vector3 m_position;
float m_yawSpeed; // yaw speed in radians/sec
float m_yaw; // yaw in radians
float m_yawRenderOffset; // additional yaw to add before rendering

float m_pitchSpeed; // pitch speed in radians/sec
float m_pitch; // pitch in radians

float m_forwardSpeed; // added to m_position based on calculated forward vec

Then Update() recalculates the model matrix (again based on the Madhav book mentioned yesterday). It's currently limited to yaw rotation about the Y axis.

void Entity::Update( float dt )
{
    m_yaw += m_yawSpeed * dt;

    // TODO - account for pitch in forward as well
    // default forward points right; along x axis
    Vectormath::Aos::Vector3 forward( 1.0f, 0.0f, 0.0f );
    // rotate the right-pointing forward about Y axis based on yaw
    Vectormath::Aos::Quat yaw = Vectormath::Aos::Quat::rotationY( m_yaw );
    forward = Vectormath::Aos::rotate( yaw, forward );

    // move forward
    m_position = m_position + (forward * m_forwardSpeed * dt);

    // TODO - 
    // This doesn't allow for pitch or roll rotation...
    m_modelMat = Vectormath::Aos::Matrix4(
        Vectormath::Aos::Transform3(
            Vectormath::Aos::Quat::rotationY( m_yaw ),
            m_position ));
}
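
The forward calculation above can be checked without the PS3 toolchain. A minimal sketch, assuming Quat::rotationY() is a standard right-handed rotation about +Y (Vec3, ForwardFromYaw and StepPosition are stand-in names, not Vectormath types):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Stand-in for rotate( Quat::rotationY( yaw ), Vector3( 1, 0, 0 ) ):
// a right-handed rotation of the +X axis about +Y.
Vec3 ForwardFromYaw( float yaw )
{
    return Vec3{ std::cos( yaw ), 0.0f, -std::sin( yaw ) };
}

// One integration step; forwardSpeed and dt must use the same time unit.
Vec3 StepPosition( const Vec3& pos, float yaw, float forwardSpeed, float dt )
{
    const Vec3 f = ForwardFromYaw( yaw );
    return Vec3{ pos.x + f.x * forwardSpeed * dt,
                 pos.y,
                 pos.z + f.z * forwardSpeed * dt };
}
```

With yaw 0 the entity moves along +X, and a yaw of 90 degrees turns that into -Z, which is a quick way to sanity-check which way a model should face for a given stick direction.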

Next, I added controller input to move/rotate the entity (Filter2D is again based on Madhav, except with controller values scaled for a -128 to 127 range instead of -32768 to 32767):

// input range 0-255, middle = 127
Vector2 Filter2D( int inputX, int inputY )
{
    const float deadZone = (8000.0f/32768.0f) * 127.0f;
    const float maxValue = (30000.0f/32768.0f) * 127.0f;

    Vector2 dir;
    dir.x = (float)(inputX) - 127.0f;
    dir.y = (float)(inputY) - 127.0f;

    float length = dir.Length();

    if ( length < deadZone )
    {
        dir.x = 0.0f;
        dir.y = 0.0f;
    }
    else
    {
        // calculate fractional between dead zone and max value circles
        float f = ( length - deadZone ) / ( maxValue - deadZone );

        // clamp between 0 and 1
        if ( f < 0.0f ) { f = 0.0f; }
        if ( f > 1.0f ) { f = 1.0f; }

        // normalize vector and scale it to the fractional value
        dir.x *= f / length;
        dir.y *= f / length;
    }

    return dir;
}
...
// main loop
Vector2 analogDir = Filter2D( g_Pad.GetLeftAnalogX(),
                               g_Pad.GetLeftAnalogY() );
float dirLen = analogDir.Length();
g_MD5Node->SetForwardSpeed( -dirLen * 0.005f );
static bool isIdle = false;
if ( dirLen > 0.000001f ) {
    float angle = atan2( -analogDir.y, analogDir.x );
    g_MD5Node->SetYaw( angle );
}

g_MD5Node->Update( g_Timer.GetDeltaMS() );
g_Camera.SetTarget( g_MD5Node->GetPosition() );
g_Camera.Update( g_Timer.GetDeltaMS() );

Then confirmed that we can rotate the entity and move it around:

06/23/2021

Added the walking animation to the controlled MD5 Model, which turned out to be way harder than I thought it would.

The first step was simple: load and play a different .md5anim file instead of the idle anim:

EXIT_ON_FAIL( g_MD5Data.AddAnim( ASSETS_DIR"/models/hellknight/idle2.md5anim" ),
              "Failed to load md5 anim\n",
              "Success loading md5 anim\n" );
EXIT_ON_FAIL( g_MD5Data.AddAnim( ASSETS_DIR"/models/hellknight/walk7.md5anim" ),
              "Failed to load md5 anim\n",
              "Success loading md5 anim\n" );
//g_MD5Node->PlayAnim( ASSETS_DIR"/models/hellknight/idle2.md5anim" );
g_MD5Node->PlayAnim( ASSETS_DIR"/models/hellknight/walk7.md5anim" );

But that causes the model to move forward during its anim sequence:

Based on this post, to try and keep the model from moving forward during its walk cycle, I tried subtracting the interpolated current and next root joint's anim transform from the model position, but the result was very jerky/not smooth:

// MD5Anim.cpp
skelJoint.transform = MD5FileData::conversionMatrix * skelJoint.localTransform;
curRootTransform = skelJoint.transform;
Vectormath::Aos::Matrix4 nextLocalTransform;
Vectormath::Aos::Transform3 nextT3( animQuat, nextAnimPos ); // don't care about rotation, only using position
nextLocalTransform = Vectormath::Aos::Matrix4( nextT3 );
nextRootTransform = MD5FileData::conversionMatrix * nextLocalTransform;
...
// MD5Node.cpp
Vectormath::Aos::Matrix4& rootMat = currentAnim->GetCurRootTransform();
Vectormath::Aos::Matrix4& nextRootMat = currentAnim->GetNextRootTransform();
Vectormath::Aos::Vector3 offset = rootMat.getTranslation();
Vectormath::Aos::Vector3 nextOffset = nextRootMat.getTranslation();

// LERP between the offsets
if ( currentAnim->GetFrameRate() < 0.000001f ) {
    //DBG_LOG( "WARN: current anim frame rate 0.0f\n" );
    return Vectormath::Aos::Vector3( 0.0f, 0.0f, 0.0f );
}
// frameTime is in milliseconds not seconds, so frame rate scaled by 1000
float t = frameTime / ( 1000.0f / currentAnim->GetFrameRate() );
if ( t < 0.0f || t > 1.0f ) {
    t = 0.0f;
    DBG_LOG( "ERROR - bad t (frameTime, frameRate: %f, %f )\n", frameTime, currentAnim->GetFrameRate() );
}
offset = Vectormath::Aos::lerp( 1.0f - t, offset, nextOffset );
return offset;
...
// Entity.cpp
// TODO - account for pitch in forward as well
// default forward points right; along x axis
Vectormath::Aos::Vector3 forward( 1.0f, 0.0f, 0.0f );
Vectormath::Aos::Vector3 moveOffset( GetMovingAnimOffset() );
// rotate the right-pointing forward about Y axis based on yaw
Vectormath::Aos::Quat yaw = Vectormath::Aos::Quat::rotationY( m_yaw );
forward = Vectormath::Aos::rotate( yaw, forward );
moveOffset = Vectormath::Aos::rotate( yaw, moveOffset );

// move forward
m_position = m_position + (forward * m_forwardSpeed * dt);

// TODO
// This doesn't allow for pitch or roll rotation...
m_modelMat = Vectormath::Aos::Matrix4(
    Vectormath::Aos::Transform3(
        Vectormath::Aos::Quat::rotationY( m_yaw ),
        m_position - moveOffset )); // offset position by animation offset
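
A possible contributor to the jerkiness, not verified on hardware: Vectormath's lerp( t, a, b ) evaluates a + (b - a) * t, so passing 1.0f - t starts each frame at the next offset and walks back toward the current one. A minimal standalone sketch of the difference (Vec3 and Lerp are stand-ins that mirror that formula, not the PSL1GHT code itself):

```cpp
struct Vec3 { float x, y, z; };

// Same formula as Vectormath::Aos::lerp( t, a, b ): a + (b - a) * t.
// t = 0 returns a, t = 1 returns b.
Vec3 Lerp( float t, const Vec3& a, const Vec3& b )
{
    return Vec3{ a.x + ( b.x - a.x ) * t,
                 a.y + ( b.y - a.y ) * t,
                 a.z + ( b.z - a.z ) * t };
}
```

With a current offset of 0 and a next offset of 10, Lerp( t, ... ) at the start of a frame (t = 0) gives 0, while Lerp( 1.0f - t, ... ) gives 10, so the subtracted offset would jump at every keyframe boundary.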

Again, this didn't work well. I next tried setting the translation component of the root joint's animation matrix directly to zero, which worked like a charm! It's also much simpler and less expensive:

// MD5Anim.cpp
...
skelJoint.transform = MD5FileData::conversionMatrix * skelJoint.localTransform;
Vectormath::Aos::Vector3 transVec( 0.0f, 0.0f, 0.0f ); // don't move the model's position at all please
skelJoint.transform.setTranslation( transVec );

So now we can move our model directly with the controller and trigger the walking animation:

07/25/2021

Working on porting Sanjay Madhav's Game Programming book to PS3, starting with Chapter 8 (Input system and asteroids game)

PSL1GHT comes with a port of SDL 1.2, not 2.0, so the first thing was converting the relevant code. That mostly means using SDL_Surfaces for everything instead of SDL_Renderer and SDL_Window. Additionally, the chapter uses OpenGL to draw everything in 2D. For this chapter's code I'm hoping to just use SDL and worry about PS3 RSX programming later. However, one problem I currently need to fix is that SDL 1.2 doesn't include the ability to rotate surfaces/sprites, so I'll either have to do that manually on the surface pixels or suck it up and program with the rsx/shaders so I can do rotation that way.

So far I have the ship sprite drawn (using SDL_image) and input working. I had an issue at first running on PS3 hardware that RPCS3 emulator didn't catch, related to SDL initialization:


// This code freezes on PS3 hardware
if (SDL_Init(SDL_INIT_VIDEO|SDL_INIT_AUDIO|SDL_INIT_JOYSTICK|SDL_INIT_TIMER) != 0)
{
    SDL_Log("Unable to initialize SDL: %s", SDL_GetError());
    return false;
}
mWindow = SDL_SetVideoMode(1024, 768, 32, SDL_HWSURFACE);
if (!mWindow) {
    SDL_Log("Failed to set video mode: %s\n", SDL_GetError());
    return false;
}

// This code works/doesn't freeze
if (SDL_Init(SDL_INIT_VIDEO|SDL_INIT_JOYSTICK) != 0)
{
    SDL_Log("Unable to initialize SDL: %s", SDL_GetError());
    return false;
}

mWindow = SDL_SetVideoMode(1024, 768, 0, SDL_FULLSCREEN);
if (!mWindow) {
    SDL_Log("Failed to set video mode: %s\n", SDL_GetError());
    return false;
}

This was based on looking at kernow5000s, which, looking at it again now, I noticed doesn't actually target the PS3. Searching on GitHub, at least, I don't find many results for "sdl ps3."

Currently, after creating a few laser objects the PS3 freezes as well. This happens on both the rpcs3 emulator and hardware:

if (input.GetMappedButtonState("Fire") == EHeld
    && mLaserCooldown <= 0.0f)
{
    // this commented out code causes freeze after a few calls...
    // Create a laser and set its position/rotation to mine
    //Laser* laser = new Laser(GetGame());
    //laser->SetPosition(GetPosition());
    //laser->SetRotation(GetRotation());
    //SDL_Log("Ship fire");

    // Reset laser cooldown (quarter second)
    mLaserCooldown = 0.25f;
}

I'm assuming this is related to dynamic memory allocation; that's a TODO for later. Next, SDL 1.2 doesn't have direct button mappings for the PS3 controller; you just specify the button via an integer, so I added some defines after testing which integer matches which button:

// How ps3 controller buttons are mapped to SDL buttons
#define PS3_BUTTON_LEFT     0
#define PS3_BUTTON_DOWN     1
#define PS3_BUTTON_RIGHT    2
#define PS3_BUTTON_UP       3
#define PS3_BUTTON_START    4
#define PS3_BUTTON_RANALOG  5
#define PS3_BUTTON_LANALOG  6
#define PS3_BUTTON_SELECT   7
#define PS3_BUTTON_SQUARE   8
#define PS3_BUTTON_CROSS    9
#define PS3_BUTTON_CIRCLE   10
#define PS3_BUTTON_TRIANGLE 11
#define PS3_BUTTON_R1       12
#define PS3_BUTTON_L1       13
#define PS3_BUTTON_R2       14
#define PS3_BUTTON_L2       15

Finally, a note: it seems I can't get variable L2/R2 trigger values; they are just mapped as buttons. It may be worth ditching SDL for input and just using PSL1GHT directly in this case...

The current progress of the chapter8 code looks like this:

07/27/2021

Basically finished chapter 8; the only thing missing is sprite rotation, which I'm not sure I'll bother with, since I wanted to focus on 3D next anyways.

The issue I had with Lasers crashing the code turned out to be a bug I had when re-introducing the Actors back into the game class:

// Bugged code I wrote - deleting an entry from mActors instead of mPendingActors,
// causing invalid memory access!
void Game::RemoveActor(Actor* actor)
{
    // Is it in pending actors?
    auto iter = std::find(mPendingActors.begin(), mPendingActors.end(), actor);
    if (iter != mPendingActors.end())
    {
        // Swap to end of vector and pop off (avoid erase copies)
        std::iter_swap(iter, mActors.end() - 1);
        mActors.pop_back();
    }
}

// Proper version, fixed the bug:
void Game::RemoveActor(Actor* actor)
{
    // Is it in pending actors?
    auto iter = std::find(mPendingActors.begin(), mPendingActors.end(), actor);
    if (iter != mPendingActors.end())
    {
        // Swap to end of vector and pop off (avoid erase copies)
        std::iter_swap(iter, mPendingActors.end() - 1);
        mPendingActors.pop_back();
    }

    // Is it in actors?
    iter = std::find(mActors.begin(), mActors.end(), actor);
    if (iter != mActors.end())
    {
        // Swap to end of vector and pop off (avoid erase copies)
        std::iter_swap(iter, mActors.end() - 1);
        mActors.pop_back();
    }
}
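
Since the bug came from mixing two containers in one swap-and-pop, one way to make that mistake harder is to put the idiom behind a helper that only ever sees a single vector. A minimal sketch (SwapAndPop is a hypothetical helper, not from the book's code):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: remove the first match by swapping it with the last
// element and popping, always using the SAME container for both steps.
// Returns true if something was removed. Element order is not preserved.
template <typename T>
bool SwapAndPop( std::vector<T>& items, const T& value )
{
    auto iter = std::find( items.begin(), items.end(), value );
    if ( iter == items.end() ) { return false; }
    std::iter_swap( iter, items.end() - 1 );
    items.pop_back();
    return true;
}
```

RemoveActor() could then call SwapAndPop( mPendingActors, actor ) followed by SwapAndPop( mActors, actor ), assuming both members are std::vector<Actor*> as in the snippets above.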

After that, I had to modify the MoveComponent class screen position calculation from OpenGL coordinates where the center of the screen is 0,0 to SDL coordinates where the top left of the screen is 0,0:

// Old OpenGL coordinates code:
// Screen wrapping (for asteroids)
if (pos.x < -512.0f) { pos.x = 510.0f; }
else if (pos.x > 512.0f) { pos.x = -510.0f; }
if (pos.y < -384.0f) { pos.y = 382.0f; }
else if (pos.y > 384.0f) { pos.y = -382.0f; }
mOwner->SetPosition(pos);

// New SDL coordinates version:
float sWidth = (float)mOwner->GetGame()->GetScreenWidth();
float sHeight = (float)mOwner->GetGame()->GetScreenHeight();

// Screen wrapping (for asteroids)
if (pos.x < 0.0f) { pos.x = sWidth-1; }
else if (pos.x > sWidth) { pos.x = 1.0f; }
if (pos.y < 0.0f) { pos.y = sHeight-1; }
else if (pos.y > sHeight) { pos.y = 1.0f; }
mOwner->SetPosition(pos);
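
The same origin change applies to any position carried over from the book's code, not just the wrap limits. A minimal sketch of the conversion, assuming the book's layout has the origin at the screen center with +y up and SDL has the origin at the top left with +y down (Vec2 and CenteredToTopLeft are illustrative names):

```cpp
struct Vec2 { float x, y; };

// Convert a center-origin, +y-up position into a top-left-origin,
// +y-down position for a screen of the given size.
Vec2 CenteredToTopLeft( const Vec2& p, float screenW, float screenH )
{
    return Vec2{ p.x + screenW * 0.5f, screenH * 0.5f - p.y };
}
```

For a 1024x768 screen, the old center (0, 0) lands at (512, 384), and the old top-left corner (-512, 384) lands at (0, 0).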

Now the code behaves like the original book version, except that a) the sprites don't rotate and b) only a square in the center of the screen is used; I'm assuming that's because I hard-coded the window resolution to 1024x768:

I now have a git repo for the code here

07/28/2021

Working on changing SDL rendering to rsx 3D rendering so we can a) use the full screen and b) rotate sprites. It'll also be needed for the further chapters anyways to render 3D.

Currently have the Renderer class and Shader class seemingly working (technically the shader hasn't been tested yet). Renderer is initialized in the Game class and clears the screen white.

Started texture class, not finished. Planning to use SDL_image to load the texture data to upload to the rsx as a texture, in place of SOIL in the book code.

Also todo, implement the VertexArray class to have a PS3 version of that, as opposed to storing the vertex data in an object in the game class.

07/30/2021

Got the 3D/shader version of chapter 08 fully working now.

I had the most trouble with matrices and shader transformations. The book uses (I think) a left-handed coordinate system and stores its matrices in row-major order, which is different from the OpenGL-centric manner I'm used to seeing (right-handed coordinates and column-major order). The book tells OpenGL to transpose the matrices as they are sent to the shader (GL_TRUE):

// book code
void Shader::SetMatrixUniform(const char* name, const Matrix4& matrix)
{
    // Find the uniform by this name
    GLuint loc = glGetUniformLocation(mShaderProgram, name);
    // Send the matrix data to the uniform
    // GL_TRUE - "For the matrix commands, specifies whether to transpose the matrix as the values are loaded into the uniform variable."
    glUniformMatrix4fv(loc, 1, GL_TRUE, matrix.GetAsFloatPtr());
}

Then in the vertex shader, the multiplication order is backwards compared to what I've seen previously:

// Transform position to world space, then clip space
gl_Position = pos * uWorldTransform * uViewProj;
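
The two conventions are related by a transpose: a row vector times a matrix gives the same numbers as the transposed matrix times a column vector, so v * W * VP equals transpose(VP) * (transpose(W) * v). A minimal sketch with plain arrays (Mat4, Vec4 and the helper names are illustrative, not the book's or PSL1GHT's types):

```cpp
#include <array>

using Mat4 = std::array<std::array<float, 4>, 4>; // m[row][col]
using Vec4 = std::array<float, 4>;

Mat4 Transpose( const Mat4& m )
{
    Mat4 t{};
    for ( int r = 0; r < 4; ++r )
        for ( int c = 0; c < 4; ++c )
            t[c][r] = m[r][c];
    return t;
}

// Row-vector convention (the book's shader): result = v * M
Vec4 MulRowVec( const Vec4& v, const Mat4& m )
{
    Vec4 out{};
    for ( int c = 0; c < 4; ++c )
        for ( int r = 0; r < 4; ++r )
            out[c] += v[r] * m[r][c];
    return out;
}

// Column-vector convention (what I'm used to seeing): result = M * v
Vec4 MulColVec( const Mat4& m, const Vec4& v )
{
    Vec4 out{};
    for ( int r = 0; r < 4; ++r )
        for ( int c = 0; c < 4; ++c )
            out[r] += m[r][c] * v[c];
    return out;
}
```

A translation matrix with its offsets in the bottom row moves a point correctly under MulRowVec, and gives the identical result under MulColVec once it has been transposed.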

The question then is how does this translate to the PS3/rsx? After some experimentation, the following works for me; however, I hard-coded the z value to zero, otherwise it was freaking out:

ePosition = mul(mul(transpose(uViewProj),transpose(uWorldTransform)),pos);
ePosition.z = 0.0;

With this however, the sprite rotations were backwards and the y-coordinate was backwards also. To work around this, I negated the actor rotation and y-position when calculating the world matrix:

// Scale, then rotate, then translate
mWorldTransform = Matrix4::CreateScale(mScale);
mWorldTransform *= Matrix4::CreateRotationZ(-mRotation); // negated rotation and y position below
mWorldTransform *= Matrix4::CreateTranslation(Vector3(mPosition.x, -mPosition.y, 0.0f));

I'll probably end up replacing the book's matrix/vector classes with the psl1ght provided versions to use hardware accelerated computations anyways for the next chapters. We now have a version that uses the entire available screen space and hardware accelerated drawing (Also my first time converting a video to webm below; it's only 178KB! Seems to be a good alternative to gifs...)

Recorded from rpcs3: (I confirmed it works on hardware too)

08/07/2021

Testing out what I have implemented so far from Chapter 9. The first test is .gpmesh and VertexArray, to verify those can be rendered correctly. I'm doing this by adding them to my previous rsxtest program since I know that works.

First bug was in the vertex array class - I wasn't allocating enough memory for the indices because the rsxMemalign call was allocating for 16bit indices instead of 32:

// Oops! should be below instead
mMeshBuffer.indices32 = (u32*)rsxMemalign( 128, mMeshBuffer.cnt_indices * sizeof(u16) );

// That's better
mMeshBuffer.indices32 = (u32*)rsxMemalign( 128, mMeshBuffer.cnt_indices * sizeof(u32) );

After that, I was indexing the VertexArray mMeshBuffer vertices out of bounds due to an index calculation mismatch:

// vertices[i] results in invalid vertices indexes
for (unsigned int i=0; i < mMeshBuffer.cnt_vertices*8; i += 8) {
    mMeshBuffer.vertices[i] = S3DVertex( verts[i+0], verts[i+1], verts[i+2], // pos
                                         verts[i+3], verts[i+4], verts[i+5], // normal
                                         verts[i+6], verts[i+7] ); // tex coord
}

// should have been this
for (unsigned int i=0,j=0; i < mMeshBuffer.cnt_vertices*8; i += 8,++j) {
    mMeshBuffer.vertices[j] = S3DVertex( verts[i+0], verts[i+1], verts[i+2], // pos
                                         verts[i+3], verts[i+4], verts[i+5], // normal
                                         verts[i+6], verts[i+7] ); // tex coord
}

Once that was fixed the gpmesh file (a sphere) was rendered successfully, although it is still missing its proper texture (it's using the tree's texture in this image):

08/12/2021

Got the chapter 9 sprite component code working (at least in the rsx test sandbox I've been using). My first step was to recreate the 'CreateSimpleViewProj' function to use the Vectormath matrix class. I got the source code for the psl1ght perspective Matrix4 function (here) and then modified it to match Madhav's code:

inline Vectormath::Aos::Matrix4 CreateSimpleViewProj(float width, float height)
{
    vec_float4 zero, col0, col1, col2, col3;
    union { vec_float4 v; float s[4]; } tmp;
    zero = ((vec_float4){0.0f,0.0f,0.0f,0.0f});
    tmp.v = zero;
    tmp.s[0] =  2.0f / width;
    col0 = tmp.v;
    tmp.v = zero;
    tmp.s[1] = 2.0f / height;
    col1 = tmp.v;
    tmp.v = zero;
    tmp.s[2] = 1.0f;
    col2 = tmp.v;
    tmp.v = zero;
    tmp.s[2] = 1.0f;
    tmp.s[3] = 1.0f;
    col3 = tmp.v;
    return Vectormath::Aos::Matrix4(
        Vectormath::Aos::Vector4( col0 ),
        Vectormath::Aos::Vector4( col1 ),
        Vectormath::Aos::Vector4( col2 ),
        Vectormath::Aos::Vector4( col3 )
    );
}
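
To see what this matrix does to a point, the multiplication can be written out by hand. A minimal sketch, assuming the four columns above are applied to a point with w = 1 (Vec4 and ApplySimpleViewProj are illustrative names):

```cpp
struct Vec4 { float x, y, z, w; };

// Result of x * col0 + y * col1 + z * col2 + 1 * col3 for the columns above:
// x and y are scaled from pixels to the -1..1 range, z is shifted by 1.
Vec4 ApplySimpleViewProj( float px, float py, float pz, float width, float height )
{
    return Vec4{ 2.0f * px / width, 2.0f * py / height, pz + 1.0f, 1.0f };
}
```

On a 1024x768 screen, the point (512, -384, 0) comes out as (1, -1, 1, 1), so half the screen width and height map to the edges of clip space.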

Next, I made a psl1ght version of the sprite vertex/fragment shaders:

// Sprite.vcg
void main
(
    float3 inPosition : POSITION,
    float3 inNormal : NORMAL,
    float2 inTexCoord : TEXCOORD0,
    
    uniform float4x4 uWorldTransform,
    uniform float4x4 uViewProj,
    
    out float4 ePosition : POSITION,
    out float4 oPosition : TEXCOORD0,
    out float3 oNormal : TEXCOORD1,
    out float2 oTexcoord : TEXCOORD2
)
{
    float4 pos = float4(inPosition,1.0f);
    ePosition = mul(mul(uViewProj,uWorldTransform),pos);
    
    oPosition = pos;
    oNormal = inNormal;
    oTexcoord = inTexCoord;
}

// Sprite.fcg
void main
(
    float4 position : TEXCOORD0,
    float3 normal : TEXCOORD1,
    float2 fragTexCoord : TEXCOORD2,
    
    uniform sampler2D uTexture,
    
    out float4 oColor
)
{
    oColor = tex2D(uTexture, fragTexCoord);
}

Note you have to transpose the Vectormath matrices before sending them to the shader; Madhav's code does this automatically by specifying GL_TRUE as the transpose argument to glUniformMatrix4fv. The final world matrix code for the sprite drawing looks like:

Vectormath::Aos::Matrix4 parentWorld = Vectormath::Aos::Matrix4::translation( mPosition );
Vectormath::Aos::Matrix4 world = parentWorld * scaleMat;
world = Vectormath::Aos::transpose(world);
shader->SetVertexProgParam( "uWorldTransform", (float*)&world );

During debugging I added a quick helper function to print out the contents of a matrix, which works the same for both the original Chapter 9 code and the ps3 version, where you just print the matrix object as an array of floats:

inline void PrintMat4( Vectormath::Aos::Matrix4& mat )
{
    float* ptr = (float*)&mat;
    for (int i=0; i<4; ++i)
    {
        for (int j=0; j<4; ++j)
        {
            printf("%08.08f ", *ptr++);                               
        }
        printf("\n");
    }
    printf("\n");
}

That way I could compare the matrix contents of both programs side by side:

A final note is that disabling depth test is required for drawing the sprites so they are always drawn on top:

//Renderer.h
void SetDepthTest(const bool isEnabled) {
    rsxSetDepthTestEnable(m_context, isEnabled ? GCM_TRUE : GCM_FALSE);
}
//SpriteComponent.cpp
renderer->SetDepthTest(false);
renderer->RenderSpriteVertexArray();
renderer->SetDepthTest(true);

The result is that we can see the healthbar image from Chapter9:

08/13/2021

Made more progress on Chapter 9 drawing; I think things are mostly being drawn properly now, but mirrored or something.

Notes:

Matrices do indeed need to be transposed before they are sent to the shader
Matrix and Quaternion multiplication needs to happen from right-to-left like you would see in OpenGL shaders
A quaternion built from a rotation angle and an axis needs to be created with the function Vectormath::Aos::Quat::rotation(), as opposed to a Quat constructor (the Quat constructor just sets the x,y,z,w values)
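
The difference in the last note is easy to see when the axis-angle math is written out. A minimal sketch of what an axis-angle factory like Quat::rotation() has to compute, assuming a unit-length axis (Quat and QuatFromAxisAngle here are stand-ins, not the Vectormath types):

```cpp
#include <cmath>

struct Quat { float x, y, z, w; };

// Axis-angle to quaternion: xyz = axis * sin(angle / 2), w = cos(angle / 2).
// The axis is assumed to be unit length.
Quat QuatFromAxisAngle( float radians, float ax, float ay, float az )
{
    const float s = std::sin( radians * 0.5f );
    return Quat{ ax * s, ay * s, az * s, std::cos( radians * 0.5f ) };
}
```

A 180 degree turn about Y comes out as roughly (0, 1, 0, 0), whereas filling the components directly with the axis and the angle would give (0, 1, 0, 3.14...), which is not a unit quaternion and does not represent that rotation.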

Running on actual hardware however causes the rendering to freak out like the rsxsample test above when objects are too close to the camera (i.e. the rifle). Removing the rifle is better (everything is rendered properly) but the FPS still tanks. Not sure why that's happening... potentially related to culling/clipping perhaps?

08/14/2021

Got RSXGL working with my current toolchain version (gcc 7!). Found a thread on psx-place: link where user Crystal had tested a version pre-compiled with gcc 4 against a newer version of the ps3dev toolchain (gcc 7). I got it mostly working with the following steps and changes, and pushed the changes to my forked repo: https://github.com/williamblair/RSXGL

git clone https://github.com/crystalct/RSXGL.git
cd RSXGL
./autogen.sh
./configure
make -f mk.inst
cd rsxgl_gear_sample

Edit makefile eigen include dirs to:
export INCLUDE  :=  $(foreach dir,$(INCLUDES), -I$(CURDIR)/$(dir)) \
                    $(foreach dir,$(LIBDIRS),-I$(dir)/include) \
                    $(LIBPSL1GHT_INC) -I$(CURDIR)/$(BUILD) \
                    -I$(PORTLIBS)/include -I$(CURDIR)/../extsrc/eigen/Eigen \
                    -I$(CURDIR)/../extsrc/eigen

Remove eigen 2 support cflag:
#CFLAGS      =   -O2 -Wall -mcpu=cell $(MACHDEP) $(INCLUDE) -D__RSX__ -DEIGEN2_SUPPORT
CFLAGS      =   -O2 -Wall -mcpu=cell $(MACHDEP) $(INCLUDE) -D__RSX__

Add -std=c++11 to CXX flags to get rid of std::round and std::log1p error:
CXXFLAGS    =   $(CFLAGS) -std=c++11

make

rsx_gl_gear_sample.self should now exist and work

cd ../src/samples/rsxglgears
make
fself.py rsxglgears.elf rsxglgears.self
rsxglgears.self should now exist and work

cd ../rsxgltest
make
fself.py rsxgltest.elf rsxgltest.self

The resulting .self files for glgears and rsxgltest worked! Strangely, the top-level rsx_gl_cube_test didn't work though; I think it's related to the shader compilation, as I had to modify the Makefile because cgc didn't exist in the location the Makefile thought it should.

08/15/2021

Doing some experimentation with RSXGL; had a weird issue with a Phong fragment shader. An if statement checking NdotL caused the shader to not work (if tested against 0, it failed to compile; if tested against 0.0, the output pixel color was always black). The solution was to remove the if statement altogether and use the clamp() function to force the values between 0.0 and 1.0.

// doesn't work
vec3 Phong = uAmbientLight;
float NdotL = dot(N, L);
if (NdotL > 0)
{
    vec3 Diffuse = uDirLight.mDiffuseColor * NdotL;
    vec3 Specular = uDirLight.mSpecColor * pow(max(0.0, dot(R, V)), uSpecPower);
    Phong += Diffuse + Specular;
}
// works
vec3 Phong = uAmbientLight;
float NdotL = dot(N, L);
vec3 Diffuse = clamp(uDirLight.mDiffuseColor * NdotL, 0.0f, 1.0f);
vec3 Specular = clamp( uDirLight.mSpecColor * 
                           pow(max(0.0f, dot(R,V)), uSpecPower), 
                      0.0f, 1.0f );
Phong += Diffuse + Specular;
gl_FragColor = texture2D(uTexture, fragTexCoord) * vec4(Phong, 1.0f);
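
One thing to keep in mind is that the two versions are not strictly equivalent: the branch skips the specular term when NdotL is not positive, while the clamp version still adds it. A minimal sketch of the difference using scalar stand-ins for one color channel (Light, PhongBranch and PhongClamp are illustrative names, written in C++ rather than GLSL so it can be run on a PC):

```cpp
#include <algorithm>
#include <cmath>

// Scalar stand-ins for a single color channel of the shader math above.
struct Light { float ambient, diffuse, spec, power; };

float Clamp01( float v ) { return std::min( std::max( v, 0.0f ), 1.0f ); }

// Branching version: adds nothing when the surface faces away from the light.
float PhongBranch( const Light& l, float nDotL, float rDotV )
{
    float phong = l.ambient;
    if ( nDotL > 0.0f )
    {
        phong += l.diffuse * nDotL
               + l.spec * std::pow( std::max( 0.0f, rDotV ), l.power );
    }
    return phong;
}

// Clamp version: diffuse is clamped to zero, but specular is still added.
float PhongClamp( const Light& l, float nDotL, float rDotV )
{
    const float diffuse = Clamp01( l.diffuse * nDotL );
    const float spec = Clamp01( l.spec * std::pow( std::max( 0.0f, rDotV ), l.power ) );
    return l.ambient + diffuse + spec;
}
```

If highlights on surfaces facing away from the light turn out to matter, multiplying the specular term by step(0.0, NdotL) would be one branch-free option to try; that is untested on RSXGL.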

For glsl, you have to use attributes and varyings instead of in/out syntax, and texture2D instead of texture, e.g.

// use this
attribute vec3 inPosition;
varying vec2 fragTexCoord;
gl_FragColor = texture2D(uTexture, fragTexCoord);
// instead of
in vec3 inPosition;
out vec2 fragTexCoord;
outColor = texture(uTexture, fragTexCoord);

08/19/2021

All of the objects from Chapter9 are drawing properly now. Had to use just vertex buffer objects instead of vertex array objects, presumably because the PS3 port is OpenGL ES:

// VertexArray.cpp
void VertexArray::SetActive()
{
    // Use our previously created vertex buffer and index buffer
    // (instead of binding single vertex array object above)
    glBindBuffer(GL_ARRAY_BUFFER, mVertexBuffer);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, mIndexBuffer);

    // Specify the vertex attributes
    // (For now, assume one vertex format)
    // Position is 3 floats
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 8 * sizeof(float), 0);
    // Normal is 3 floats
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 8 * sizeof(float),
        reinterpret_cast<void*>(sizeof(float) * 3));
    // Texture coordinates is 2 floats
    glEnableVertexAttribArray(2);
    glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, 8 * sizeof(float),
        reinterpret_cast<void*>(sizeof(float) * 6));
}
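
The stride and offsets passed to glVertexAttribPointer describe an interleaved vertex of 8 floats. A minimal sketch that checks those numbers at compile time, assuming the vertex is laid out as a plain struct (Vertex is an illustrative type, not from the book's code):

```cpp
#include <cstddef>

// Assumed layout matching the attribute pointers above: 8 floats per vertex.
struct Vertex
{
    float position[3]; // attribute 0, byte offset 0
    float normal[3];   // attribute 1, byte offset 3 * sizeof(float)
    float texCoord[2]; // attribute 2, byte offset 6 * sizeof(float)
};

static_assert( sizeof( Vertex ) == 8 * sizeof( float ), "stride must be 8 floats" );
static_assert( offsetof( Vertex, normal ) == 3 * sizeof( float ), "normal offset" );
static_assert( offsetof( Vertex, texCoord ) == 6 * sizeof( float ), "tex coord offset" );
```

If the vertex format ever changes, using sizeof( Vertex ) and offsetof() in the attribute calls instead of hand-written constants would keep the two in sync.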

Additionally, I'm having a lag issue on PS3 hardware. I had the same lag using the RPCS3 emulator, but adding the renderer frame timing synchronization fixed it. It still lags on the PS3 hardware though. By lag I mean the input from the controller is handled instantaneously, but the results on the screen seem to be about half a second behind. TODO is to add timing calculations to the code and see if it's just a speed issue. My first thought is to replace the book's Math/Matrix/Quaternion code with the PSL1GHT library's version, which I'm assuming is vector optimized. Also, it might be a good chance to experiment with SPU programming to do some parallel processing if necessary.

08/22/2021

Building assimp for psl1ght:
referenced https://github.com/gzorin/RSXGL/commit/96feeea6e58ad50c6479eb4b813bd4d06babee8f

git clone git://github.com/assimp/assimp.git assimp
cd assimp
    assimp commit is d2b7e9c38c4f33b9ac96a863c94475c03cdf056d (Aug 20 2021, merge pull request #4040 from assimp/build_fixes)
git apply ../bjpsl1ghtassimp.patch
    gives warning:
    ../bjpsl1ghtassimp.patch:259: new blank line at EOF.
    +
    warning: 1 line adds whitespace errors.
mkdir build
cd build
../../bjassimpcmake.sh
../../bjremoveusrinclude.sh
../../bjassimpmake.sh
make install
    installs to /usr/local/ps3dev/portlibs/ppu

Contents of bjpsl1ghtassimp.patch can be found in the bitbucket repo

Contents of bjassimpcmake.sh:

# -DCMAKE_CXX_EXTENSIONS=OFF
AS=ppu-as \
CC=ppu-gcc \
CXX=ppu-g++ \
AR=ppu-ar \
LD=ppu-gcc \
STRIP=ppu-strip \
OBJCOPY=ppu-objcopy \
PATH=$PS3DEV/bin:$PS3DEV/ppu/bin:$PATH \
PORTLIBS=$PS3DEV/portlibs/ppu \
LIBPSL1GHT_INC="-I$PSL1GHT/ppu/include -I$PSL1GHT/ppu/include/simdmath" \
LIBPSL1GHT_LIB="-L$PSL1GHT/ppu/lib" \
CFLAGS_INIT="" \
CXXFLAGS_INIT="-D_GLIBCX11_USE_C99_STDIO" \
CMAKE_INCLUDE_DIRECTORIES=$PSL1GHT/ppu/include:$PSL1GHT/ppu/include/simdmath \
CMAKE_LIBRARY_DIRECTORIES=$PSL1GHT/ppu/lib \
CMAKE_PREFIX_PATH=$PSL1GHT/ppu \
cmake ../ \
-DCMAKE_CXX_COMPILER=ppu-g++ \
-DCMAKE_C_COMPILER=ppu-gcc \
-DCMAKE_INSTALL_PREFIX=/usr/local/ps3dev/portlibs/ppu \
-DCMAKE_CXX_FLAGS="-MMD -MP -MF \
-D_GLIBCXX11_USE_C99_STDIO -mcpu=cell -mhard-float \
-fmodulo-sched -ffunction-sections -fdata-sections \
-D__RSX__ \
-D_XOPEN_SOURCE=500 \
-I$PSL1GHT/ppu/include -I$PSL1GHT/ppu/include/simdmath \
-I$PS3DEV/portlibs/ppu/include \
-L$PSL1GHT/ppu/lib -L$PSL1GHT/portlibs/ppu/lib" \
-DCMAKE_C_FLAGS="-MMD -MP -MF \
-D_GLIBCXX11_USE_C99_STDIO -mcpu=cell -mhard-float \
-fmodulo-sched -ffunction-sections -fdata-sections \
-D__RSX__ \
-D_XOPEN_SOURCE \
-I$PSL1GHT/ppu/include -I$PSL1GHT/ppu/include/simdmath \
-I$PS3DEV/portlibs/ppu/include \
-L$PSL1GHT/ppu/lib -L$PSL1GHT/portlibs/ppu/lib" \
-DBUILD_SHARED_LIBS=0 \
-DASSIMP_BUILD_TESTS=0

Contents of bjremoveusrinclude.sh:

sed -i 's/\-I\/usr\/include//' ./code/CMakeFiles/assimp.dir/flags.make 
#sed -i 's/\-I\/usr\/include//' ./test/CMakeFiles/unit.dir/flags.make 
sed -i 's/\-I\/usr\/include//' ./tools/assimp_cmd/CMakeFiles/assimp_cmd.dir/flags.make

Contents of bjassimpmake.sh:

AS=ppu-as \
CC=ppu-gcc \
CXX=ppu-g++ \
AR=ppu-ar \
LD=ppu-gcc \
STRIP=ppu-strip \
OBJCOPY=ppu-objcopy \
PATH=$PS3DEV/bin:$PS3DEV/ppu/bin:$PATH \
PORTLIBS=$PS3DEV/portlibs/ppu \
LIBPSL1GHT_INC="-I$PSL1GHT/ppu/include -I$PSL1GHT/ppu/include/simdmath" \
LIBPSL1GHT_LIB="-L$PSL1GHT/ppu/lib" \
CFLAGS_INIT="" \
CXXFLAGS_INIT="-D_GLIBCX11_USE_C99_STDIO" \
CMAKE_INCLUDE_DIRECTORIES=$PSL1GHT/ppu/include:$PSL1GHT/ppu/include/simdmath \
CMAKE_LIBRARY_DIRECTORIES=$PSL1GHT/ppu/lib \
CMAKE_PREFIX_PATH=$PSL1GHT/ppu \
C_INCLUDE_PATH=$CMAKE_INCLUDE_DIRECTORIES \
CPLUS_INCLUDE_PATH=$CMAKE_INCLUDE_DIRECTORIES \
LIBRARY_PATH=$CMAKE_LIBRARY_DIRECTORIES \
make -j4 VERBOSE=ON

08/27/2021

Got my version of the glassview sample working. The two hard parts were installing assimp (above), and an issue that turned out to be related to the shader attributes. I had to bind the attribute locations specifically instead of getting their locations:

    // This part seems to be necessary; getting the attrib location seemed to
    // cause the program to not work...
    glBindAttribLocation(program, 0, "vertex");
    glBindAttribLocation(program, 1, "normal");
    glBindAttribLocation(program, 2, "uv");
    vertex_location = 0;
    normal_location = 1;
    uv_location = 2;

    // Link the program for real:
    glLinkProgram(program);
    glValidateProgram(program);

I also ended up not using the original author's glassimp library, mainly because I removed it while debugging (before I figured out the attrib location issue above) and never put it back in to see if it still worked...

12/1/2021

Started porting my 2048 game implementation to the PS3. One TODO is to change the window size from 640x480 to fill the PS3 screen instead, and resize tiles/font accordingly.

Currently the game is hard coded to load assets from /dev_hdd0/twentyfortyeight/data. This directory is configured in Makefile.ps3.

For some reason the PS3 controller's directional hat is mapped to buttons instead of JOYHAT events. And of course, the button numbers don't have the same mapping as the PS2 port and PC controllers.

I spent a fair amount of time messing with audio. The SDL_mixer port kind of works with Mix_Music, except the audio is really fuzzy/distorted/bad quality. After trying different Mix_OpenAudio settings I tried Mix_Chunk instead, and that works fine with the audio quality I expected. I had to re-save the .wav file exported from Audacity as a 16-bit wav in order for it to load successfully, though.

The only other big TODO left is save data. It'll be interesting to see if I can get the PS3's save menu system working.

03/07/2022

Started PS3/PSL1GHT implementation of the GameMath library. Referencing the PSL1GHT vectormath library, the core vector functions come from the altivec and simdmath headers. I'm not sure yet if these instructions are specific to the PS3 Cell CPU or more general to the PowerPC architecture. Searching for one of the function names, I found a reference listing the available vector instructions, which operate on vec4's. As a side note I also found an interesting reference, "Universal SIMD-Mathlibrary" on https://webuser.hs-furtwangen.de/~dersch/, which is described as: "single precision floating point vector datatypes are provided for the SIMD-platforms x86 (SSE2), PowerPC and Cell." I haven't looked at it yet.

I've started from the cpp implementation and have replaced most of the vec3 operations with these vector instructions, just ignoring the 4th component. So far, all of the tests in the test program pass. I should also add timestamps/performance counting for a speedup comparison, but that's a TODO for now.

Test testVec3Add passed
Test testVec3Sub passed
Test testVec3PlusEquals passed
Test testVec3MinusEquals passed
Test testVec3TimesScale passed
Test testVec3Dot passed
Test testVec3Cross passed
Test testVec3Normalize passed
Test testMat4DefaultConstructor passed
Test testMat4CopyConstructor passed
Test testMat4Translate passed
Test testMat4TranslateVec3 passed
Test testMat4RotateVec3 passed
Test testMat4ScaleVec3 passed
Test testMat4SRTMat passed
Test testMat4Inverse passed
Test testMat4Transpose passed
Test testQuatMul passed
Test testQuatRot passed
Test testQuatNormalize passed
20/20 tests passed

03/17/2022

Refactored some of the classes, including Renderer, Texture, and VertexBuffer, into a new iqmtest folder in the ps3devprogs repo, which uses the GameMath library PS3 impl. First tested a simple quad. It didn't work at first; the cause was a bug in the perspective matrix function. Next, started adding IQM model support. The base mesh works, and I was kind of expecting the animation to work right off the bat, but no such luck. I've started comparing debug output against the PC demo version and the base frame matrices don't quite match, so there's probably a bug in the GameMath impl somewhere.

03/21/2022

Switching to the cpp implementation of GameMath causes the IQM anim to work, so the next step is to debug the difference. I had to add the Perspective function to the cpp implementation in order to compile; I just copied the PS3 version and replaced the PSL1GHT vec4 with the cpp Vec4.

03/30/2022

Figured out the difference - the issue was that I had added a 4th component (w) to the PS3 Vec3 impl in order to support SIMD operations. The IqmMesh class assumed the size of a Vec3 was 3*sizeof(float), so casting a float pointer to a Vec3 pointer gave the wrong element size, and each increment of the Vec3 pointer skipped an extra float, resulting in incorrect values. To fix this, the pointers were left as float pointers and advanced by 3 for each vertex instead:

     // The actual vertex generation based on the matrixes follows...
-    const GameMath::Vec3* srcpos = (const GameMath::Vec3*)inposition;
-    const GameMath::Vec3* srcnorm = (const GameMath::Vec3*)innormal;
-    const GameMath::Vec4* srctan = (const GameMath::Vec4*)intangent;
-    GameMath::Vec3* dstpos = (GameMath::Vec3*)outposition;
-    GameMath::Vec3* dstnorm = (GameMath::Vec3*)outnormal;
-    GameMath::Vec3* dsttan = (GameMath::Vec3*)outtangent;
-    GameMath::Vec3* dstbitan = (GameMath::Vec3*)outbitangent;
+    const float* srcpos = (const float*)inposition;
+    const float* srcnorm = (const float*)innormal;
+    const float* srctan = (const float*)intangent;
+    float* dstpos = (float*)outposition;
+    float* dstnorm = (float*)outnormal;
+    float* dsttan = (float*)outtangent;
+    float* dstbitan = (float*)outbitangent;
...
-        srcpos++;
-        srcnorm++;
-        srctan++;
-        dstpos++;
-        dstnorm++;
-        dsttan++;
-        dstbitan++;
+        srcpos += 3;
+        srcnorm += 3;
+        srctan += 3;
+        dstpos += 3;
+        dstnorm += 3;
+        dsttan += 3;
+        dstbitan += 3;
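
The size mismatch behind this fix can be shown in isolation. This is only a sketch: PackedVec3 and SimdVec3 are hypothetical stand-ins for the two layouts, not the real GameMath types, and LoadVertex/WrongFloatOffset are made-up helper names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the two Vec3 layouts (not the real GameMath types).
struct PackedVec3 { float x, y, z; };               // 3 floats, matches tightly packed xyz data
struct alignas(16) SimdVec3 { float x, y, z, w; };  // 4 floats, padded for SIMD

static_assert(sizeof(PackedVec3) == 3 * sizeof(float), "packed layout is 3 floats");
static_assert(sizeof(SimdVec3) == 4 * sizeof(float), "padded layout is 4 floats");

// Reads vertex i from tightly packed xyz data by stepping 3 floats per vertex,
// then widens it to the padded type with w = 0.
inline SimdVec3 LoadVertex(const float* packed, std::size_t i) {
    assert(packed != nullptr);
    const float* p = packed + 3 * i;
    return SimdVec3{p[0], p[1], p[2], 0.0f};
}

// Float offset that incrementing a SimdVec3* over the same data would reach
// for vertex i: 4 floats per step, so it drifts by one float per vertex.
inline std::size_t WrongFloatOffset(std::size_t i) {
    return i * (sizeof(SimdVec3) / sizeof(float));
}
```

For vertex 2 the correct float offset is 6, while WrongFloatOffset(2) gives 8, which is the kind of drift that produces garbage vertex data.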

04/01/2022

Added FPS camera and TileFloor, both essentially copied from the Wii dev program. Had to add normals to the tile floor vertices. Also, for some reason one of the repeating tiles is always screwed up. It was cool to see that the FPS camera required no modification at all, aside from changing the include from Wii to PS3 for GameMath (#include <GameMath/wii/GameMath.h> to #include <GameMath/ps3/GameMath.h>).

04/05/2022

I was too cocky - the iqmtest isn't working on actual hardware :(. Should have tested it sooner...

I started looking for other code references to help debug. Found a tiny3D implementation: https://github.com/wargio/tiny3D, which also uses https://github.com/wargio/ps3soundlib. To build:

git clone https://github.com/wargio/ps3soundlib.git
cd ps3soundlib
# auto installs to /usr/local/ps3dev, make sure you don't need sudo to access
make
cd ..
git clone https://github.com/wargio/tiny3D.git
cd tiny3D
make

I tested the spheres3D example, which didn't work with ps3load remotely but did work when copying the .self to the PS3 and launching it via multiMAN. Next step is to start looking at the tiny3D source code and maybe experimenting with it some.

04/09/2022

Got IQM animation basically working on console using tiny3D, as long as I use the cpp GameMath impl instead of the SIMD version. Not sure what the SIMD version bug is yet. Also, the textures look sort of weird when you move, and I think my FPS camera isn't working right. I'll know for sure once I add the tiled floor.

Using tiny3d is basically like the older fixed-function OpenGL pipeline. It might be nice to try and modify the code a bit to be able to directly copy the vertices to the rsx buffer instead of copying vertices individually. A basic tiny3d example I confirmed works to draw a triangle is:

#include <stdio.h>

#include <tiny3d.h>
#include <matrix.h>

int main()
{
    MATRIX projMat; // projection matrix
    MATRIX mvMat; // model-view matrix

    // 4MB vertex data
    if (tiny3d_Init(TINY3D_Z16 | 4*1024*1024) < 0) {
        printf("Error init tiny3D\n");
        return 1;
    }

    projMat = MatrixProjPerspective(
        60.0f, // fov degrees
        640.0f/480.0f, // aspect
        0.00125f, // near
        300.0f // far
    );
    mvMat = MatrixTranslation(0.0f, 0.0f, 5.0f);

    for (;;)
    {
        // alpha is first in color (argb)
        tiny3d_Clear(0xFF000000, TINY3D_CLEAR_ALL);

        // enable alpha test and alpha blending
        blend_src_func srcFun = (blend_src_func)(TINY3D_BLEND_FUNC_SRC_RGB_SRC_ALPHA | TINY3D_BLEND_FUNC_SRC_ALPHA_SRC_ALPHA);
        blend_dst_func dstFun = (blend_dst_func)(TINY3D_BLEND_FUNC_DST_RGB_ONE_MINUS_SRC_ALPHA | TINY3D_BLEND_FUNC_DST_ALPHA_ZERO);
        blend_func blndFun = (blend_func)(TINY3D_BLEND_RGB_FUNC_ADD | TINY3D_BLEND_ALPHA_FUNC_ADD);
        tiny3d_AlphaTest(1, 0x10, TINY3D_ALPHA_FUNC_GEQUAL);
        tiny3d_BlendFunc(1, srcFun, dstFun, blndFun);

        // Set 3d mode
        tiny3d_Project3D();

        // Set model,view,projection matrix
        tiny3d_SetProjectionMatrix(&projMat);
        tiny3d_SetMatrixModelView(&mvMat);

        // set polygon type
        tiny3d_SetPolygon(TINY3D_TRIANGLES);
        
        // Draw vertices (position must be first)
        tiny3d_VertexPos(-0.5f, -0.5f, 0.0f);
        tiny3d_VertexFcolor(1.0f, 0.0f, 0.0f, 1.0f);
        tiny3d_VertexPos(0.5f, -0.5f, 0.0f);
        tiny3d_VertexFcolor(1.0f, 0.0f, 0.0f, 1.0f);
        tiny3d_VertexPos(0.0f, 0.5f, 0.0f);
        tiny3d_VertexFcolor(1.0f, 0.0f, 0.0f, 1.0f);

        // end of vertex list
        tiny3d_End();

        // here the projection, modelview, and texture can be changed, and
        // tiny3d_SetPolygon can be called again to start drawing more vertices

        // end of frame
        tiny3d_Flip();
    }

    return 0;
}

Tiny3D uses, I think, left-handed coordinates (+x=right, +y=up, +z=forward). To use my GameMath library I added the following conversion functions:

inline GameMath::Mat4 gameMath2TinyProjMat(GameMath::Mat4& mat) {
    GameMath::Mat4 result = Transpose(mat);
    result.v[1][1] = -result.v[1][1];
    result.v[2][2] = -result.v[2][2];
    result.v[3][2] = -result.v[3][2];
    result.v[2][3] = -result.v[2][3] * 0.5f;
    return result;
}
inline GameMath::Mat4 gameMath2TinyModelViewMat(GameMath::Mat4& mat) {
    GameMath::Mat4 result = mat;
    result.v[3][2] = -result.v[3][2];
    return result;
}

The full current iqm code is at https://bitbucket.org/williamblair/ps3devprogs/src/master/tiny3diqm/

07/24/2022

Implemented Julia set calculation in order to learn SPU parallel programming. I struggled a lot with getting DMA transfers between the PPU and SPUs to work properly. Eventually I found this reference and took a lot from it: https://github.com/zerkman/fractal. I started with a serial PPU implementation and added parallelism in steps, recording the average FPS at each step (this was with a texture width of 512; the current version has a width of 1024, so it's about half as fast).

Impl	Avg FPS
Serial PPU impl	1.6
Initial SPU impl	15-16
Remove Texture Data Copy	26-27
Initial SPU Intrinsics	43-44
SPU Hsv2Rgb lookup	51-52
-O3, -funroll-loops flags	56-57
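
For reference, the per-pixel work being parallelized is the usual escape-time iteration. The following is a plain serial C++ sketch, not the actual code from the repo; the function names, the iteration counts as output, and the square [-scale, scale] viewing window are all assumptions made for illustration.

```cpp
#include <cassert>

// Counts iterations of z = z*z + c until |z| exceeds 2, capped at maxIter.
// (zr, zi) is the point for this pixel; (cr, ci) is the Julia constant.
inline int JuliaIterations(float zr, float zi, float cr, float ci, int maxIter) {
    assert(maxIter > 0);
    int i = 0;
    while (i < maxIter && zr * zr + zi * zi <= 4.0f) {
        const float nextR = zr * zr - zi * zi + cr;  // real part of z*z + c
        zi = 2.0f * zr * zi + ci;                    // imaginary part of z*z + c
        zr = nextR;
        ++i;
    }
    return i;
}

// Fills a width*height buffer with iteration counts, sampling pixel centers
// over the window [-scale, scale] on both axes (an assumed mapping).
inline void JuliaImage(int* out, int width, int height, float scale,
                       float cr, float ci, int maxIter) {
    for (int py = 0; py < height; ++py) {
        const float zi = ((py + 0.5f) / height) * 2.0f * scale - scale;
        for (int px = 0; px < width; ++px) {
            const float zr = ((px + 0.5f) / width) * 2.0f * scale - scale;
            out[py * width + px] = JuliaIterations(zr, zi, cr, ci, maxIter);
        }
    }
}
```

Each pixel depends only on its own coordinates and the constant, so rows can be computed independently of each other, which is what makes this workload a natural fit for splitting across SPUs.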

I also added inputs to zoom in, move around, and change the complex calculation constant. The result is very fun to play around with.

The source can be found at: https://bitbucket.org/williamblair/ps3spufractal/src/master/
