


In an iOS application, I am trying to strip diacritics from a SQLite field in a table with 1 million rows.


I got a great start from an earlier answer by Rob. For the purposes of this question, I'm trying to copy a table to another using an INSERT INTO ... SELECT ... FROM ... statement. This works fine until I introduce a C function that strips accents and other diacritics from a field:

INSERT INTO DestinationTable (MasterField, SubField, IndexID) SELECT unaccented(MasterField), SubField, IndexID FROM SourceTable


With the unaccented() function introduced, I often get an EXC_BAD_ACCESS which I'll specify below. But tantalizingly, the SQLite operation will sometimes complete successfully for all 1 million rows. When I extend the load to replicate across 5 million rows, the application will always crash.


Here is my source code, with the point of EXC_BAD_ACCESS commented at the bottom of the first function:

#import <sqlite3.h>

sqlite3 *db;

void unaccented(sqlite3_context *context, int argc, sqlite3_value **argv) {
    if (argc != 1 || sqlite3_value_type(argv[0]) != SQLITE_TEXT) {
        sqlite3_result_null(context);
        return;
    }

    @autoreleasepool {
        NSMutableString *string = [NSMutableString stringWithUTF8String:(const char *)sqlite3_value_text(argv[0])];
        CFStringTransform((__bridge CFMutableStringRef)string, NULL, kCFStringTransformStripCombiningMarks, NO);

        char *buf = sqlite3_malloc(sizeof(char) * [string length] + 1);
        strcpy(buf, [string UTF8String]);

        sqlite3_result_text(context, buf, -1, sqlite3_free);
    } // This is where I usually get "EXC_BAD_ACCESS (code=1, address=...)"
}

@implementation MyClass

- (void)myMethod {
    NSArray *documentPaths = NSSearchPathForDirectoriesInDomains(NSCachesDirectory, NSUserDomainMask, YES);
    NSString *cachesDir = [documentPaths objectAtIndex:0];
    NSFileManager *fileManager = [NSFileManager defaultManager];
    NSError *error = nil;

    NSDirectoryEnumerator *directoryEnumerator = [fileManager enumeratorAtPath:[[NSBundle mainBundle] resourcePath]];

    NSString *fileName;
    while (fileName = [directoryEnumerator nextObject]) {
        if ([fileName hasSuffix:@".sqlite"]) {
            if (![fileManager fileExistsAtPath: [cachesDir stringByAppendingPathComponent:fileName]] == YES) {
                if (![fileManager copyItemAtPath:[[NSBundle mainBundle] pathForResource:fileName ofType:@""] toPath:[cachesDir stringByAppendingPathComponent:fileName] error:&error]) {
                    NSLog(@"Error description - %@ \n", [error localizedDescription]);
                    NSLog(@"Error reason - %@", [error localizedFailureReason]);
                } else {
                    NSLog(@"HAI %@", fileName);

                    [self openDB:[cachesDir stringByAppendingPathComponent:fileName]];

                    NSString *sqlCommand = @"INSERT INTO DestinationTable (MasterField, SubField, IndexID) SELECT unaccented(MasterField), SubField, IndexID FROM SourceTable";

                    char *sqlError;
                    if (sqlite3_exec(db, [sqlCommand UTF8String], NULL, NULL, &sqlError) != SQLITE_OK) {
                        NSLog(@"sqlite3_exec INSERT NOT SQLITE_OK with error: %d: %s", sqlite3_errcode(db), sqlite3_errmsg(db));
                    }

                    [self closeDB];

                    NSLog(@"KTHXBYE %@", fileName);
                }
            }
        }
    }
}

- (void) openDB: (NSString *)filePath {
    if (sqlite3_open([filePath UTF8String], &db) != SQLITE_OK) {
        NSLog(@"%s: sqlite3_open error: %s", __FUNCTION__, sqlite3_errmsg(db));
    } else {
        [self createUnaccentedFunction];
    }
}

- (void) closeDB {
    sqlite3_close(db);
}

- (void)createUnaccentedFunction {
    if (sqlite3_create_function_v2(db, "unaccented", 1, SQLITE_ANY, NULL, &unaccented, NULL, NULL, NULL) != SQLITE_OK) {
        NSLog(@"%s: sqlite3_create_function_v2 error: %s", __FUNCTION__, sqlite3_errmsg(db));
    }
}

@end


Any observations on what I'm doing wrong?



char *buf = sqlite3_malloc(sizeof(char) * [string length] + 1);
strcpy(buf, [string UTF8String]);

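Here you are sizing the buffer with [string length], which counts UTF-16 code units, not UTF-8 bytes. Any character that survives the transform and needs more than one byte in UTF-8 (for example "ø", which has no combining mark to strip, or any non-Latin text) makes the buffer too small.

strcpy then writes past the end of the allocation. That kind of heap corruption does not necessarily fail at the point of the write, which would explain why the crash is intermittent and tends to surface when the autorelease pool drains.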
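
To make the mismatch concrete, here is a small plain-C sketch (not from the original post) that counts the UTF-16 code units of a UTF-8 string, which is the quantity [string length] reports, so it can be compared with strlen. The helper name utf16_units is hypothetical, and the input is assumed to be valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the UTF-16 code units needed to represent a valid,
   NUL-terminated UTF-8 string. Hypothetical helper for illustration. */
size_t utf16_units(const char *utf8)
{
    assert(utf8 != NULL);
    size_t units = 0;
    for (const unsigned char *p = (const unsigned char *)utf8; *p; p++) {
        if ((*p & 0xC0) == 0x80)
            continue;                      /* continuation byte, not a new character */
        units += (*p >= 0xF0) ? 2 : 1;     /* 4-byte sequences need a surrogate pair */
    }
    return units;
}
```

For "Søren" (bytes S, 0xC3 0xB8, r, e, n), utf16_units gives 5 while strlen gives 6. A buffer of length + 1 bytes is therefore 6 bytes, but strcpy writes 7 including the terminating NUL.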


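I suggest letting SQLite copy the string itself. SQLITE_TRANSIENT tells it to make a private copy before sqlite3_result_text returns, so no manual buffer is needed: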

sqlite3_result_text(context, [string UTF8String], -1, SQLITE_TRANSIENT);
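
If you would rather keep a manually allocated buffer, its size has to come from the UTF-8 byte count instead. A minimal sketch in plain C, assuming malloc as a stand-in for sqlite3_malloc and a hypothetical helper name copy_utf8:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of a NUL-terminated UTF-8 string, sized in bytes.
   Hypothetical helper; malloc stands in for sqlite3_malloc here. */
char *copy_utf8(const char *utf8)
{
    assert(utf8 != NULL);
    size_t nbytes = strlen(utf8);      /* bytes, not characters */
    char *buf = malloc(nbytes + 1);    /* +1 for the terminating NUL */
    if (buf == NULL)
        return NULL;                   /* caller must handle allocation failure */
    memcpy(buf, utf8, nbytes + 1);     /* copies the NUL as well */
    return buf;
}
```

Inside the SQLite callback, the same idea would allocate with sqlite3_malloc and keep sqlite3_free as the destructor passed to sqlite3_result_text.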


