Canonicalize JSON, so equivalent objects have the same hash

I store JSON objects in a database. Many, perhaps most, of these objects will be duplicates, so I would like to key them by something like a SHA hash to avoid creating extra records.

The problem is that by the time I want to write them to the database, I no longer have the JSON bytes, only the Foundation objects returned by NSJSONSerialization. Since NSDictionary gives no guarantees about the order of its keys (and even if it did, I'm not sure the server I receive the data from does), I can't be sure that NSJSONSerialization will write each field of an object in the same order every time I call it. This means that the same object can have different digests, defeating my attempts to save space.

Is there an Objective-C JSON library that always writes the same JSON for equivalent objects, presumably by sorting keys before writing them? I'm targeting iOS 7, but this is probably a Foundation-level issue.

+3
2 answers

Instead of trying to write my own JSON serializer, I decided to trick Apple's serializer into doing what I want by using a proxy.

Usage:

NSData * JSONData = [NSJSONSerialization dataWithJSONObject:[jsonObject objectWithSortedKeys] options:0 error:&error];
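Once the serialized bytes are stable, the digest the question asks about can be computed from them. The following is a minimal sketch, assuming SHA-256 via CommonCrypto is an acceptable hash; the helper name CbxSHA256Hex is hypothetical and not part of the answer's code.

```objc
#import <Foundation/Foundation.h>
#import <CommonCrypto/CommonDigest.h>

/// Hypothetical helper (not from the answer): returns the SHA-256 digest of
/// the given bytes as a lowercase hex string, suitable as a lookup key.
static NSString *CbxSHA256Hex(NSData *data) {
    unsigned char digest[CC_SHA256_DIGEST_LENGTH];
    CC_SHA256(data.bytes, (CC_LONG)data.length, digest);

    // Two hex characters per digest byte.
    NSMutableString *hex = [NSMutableString stringWithCapacity:CC_SHA256_DIGEST_LENGTH * 2];
    for (NSUInteger i = 0; i < CC_SHA256_DIGEST_LENGTH; i++) {
        [hex appendFormat:@"%02x", digest[i]];
    }
    return hex;
}
```

With the usage line above, CbxSHA256Hex(JSONData) would give a key that can be looked up before inserting a new record.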

Header:

#import <Foundation/Foundation.h>

@interface NSObject (sortedKeys)

/// Returns a proxy for the object in which all dictionary keys, including those of child objects at any level, will always be enumerated in sorted order.
- (id)objectWithSortedKeys;

@end

Implementation:

#import "NSObject+sortedKeys.h"

/// A CbxSortedKeyWrapper intercepts calls to methods like -allKeys, -objectEnumerator, -enumerateKeysAndObjectsUsingBlock:, etc. and makes them enumerate a sorted array of keys, thus ensuring that keys are enumerated in a stable order. It also replaces objects returned by any other methods (including, say, -objectForKey: or -objectAtIndex:) with wrapped versions of those objects, thereby ensuring that child objects are similarly sorted. There are a lot of flaws in this approach, but it works well enough for NSJSONSerialization.
@interface CbxSortedKeyWrapper: NSProxy

+ (id)sortedKeyWrapperForObject:(id)object;

@end

@implementation NSObject (sortedKeys)

- (id)objectWithSortedKeys {
    return [CbxSortedKeyWrapper sortedKeyWrapperForObject:self];
}

@end

@implementation CbxSortedKeyWrapper {
    id _representedObject;
    NSArray * _keys;
}


+ (id)sortedKeyWrapperForObject:(id)object {
    if(!object) {
        return nil;
    }

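    CbxSortedKeyWrapper * wrapper = [self alloc];
    // Copy when possible so later mutation of the original cannot change the sorted keys.
    // Objects that do not conform to NSCopying (e.g. NSEnumerator) are kept as-is instead of crashing in -copy.
    wrapper->_representedObject = [object conformsToProtocol:@protocol(NSCopying)] ? [object copy] : object;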

    if([wrapper->_representedObject respondsToSelector:@selector(allKeys)]) {
        wrapper->_keys = [[wrapper->_representedObject allKeys] sortedArrayUsingSelector:@selector(compare:)];
    }

    return wrapper;
}

- (NSMethodSignature*)methodSignatureForSelector:(SEL)aSelector {
    return [_representedObject methodSignatureForSelector:aSelector];
}

- (void)forwardInvocation:(NSInvocation*)invocation {
    [invocation invokeWithTarget:_representedObject];

    BOOL returnsObject = invocation.methodSignature.methodReturnType[0] == '@';

    if(returnsObject) {
        __unsafe_unretained id out = nil;
        [invocation getReturnValue:&out];

        __unsafe_unretained id wrapper = [CbxSortedKeyWrapper sortedKeyWrapperForObject:out];
        [invocation setReturnValue:&wrapper];
    }
}

- (NSEnumerator *)keyEnumerator {
    return [_keys objectEnumerator];
}

- (NSEnumerator *)objectEnumerator {
    if(_keys) {
        return [[self allValues] objectEnumerator];
    }
    else {
        return [CbxSortedKeyWrapper sortedKeyWrapperForObject:[_representedObject objectEnumerator]];
    }
}

- (NSArray *)allKeys {
    return _keys;
}

- (NSArray *)allValues {
    return [CbxSortedKeyWrapper sortedKeyWrapperForObject:[_representedObject objectsForKeys:_keys notFoundMarker:[NSNull null]]];
}

- (void)enumerateKeysAndObjectsUsingBlock:(void (^)(id key, id obj, BOOL *stop))block {
    [_keys enumerateObjectsUsingBlock:^(id key, NSUInteger idx, BOOL *stop) {
        id obj = [CbxSortedKeyWrapper sortedKeyWrapperForObject:_representedObject[key]];
        block(key, obj, stop);
    }];
}

- (void)enumerateKeysAndObjectsWithOptions:(NSEnumerationOptions)opts usingBlock:(void (^)(id key, id obj, BOOL *stop))block {
    [_keys enumerateObjectsWithOptions:opts usingBlock:^(id key, NSUInteger idx, BOOL *stop) {
        id obj = [CbxSortedKeyWrapper sortedKeyWrapperForObject:_representedObject[key]];
        block(key, obj, stop);
    }];
}

- (void)enumerateObjectsUsingBlock:(void (^)(id obj, NSUInteger idx, BOOL *stop))block {
    [_representedObject enumerateObjectsUsingBlock:^(id obj, NSUInteger idx, BOOL * stop) {
        block([CbxSortedKeyWrapper sortedKeyWrapperForObject:obj], idx, stop);
    }];
}

- (void)enumerateObjectsWithOptions:(NSEnumerationOptions)opts usingBlock:(void (^)(id obj, NSUInteger idx, BOOL *stop))block {
    [_representedObject enumerateObjectsWithOptions:opts usingBlock:^(id obj, NSUInteger idx, BOOL * stop) {
        block([CbxSortedKeyWrapper sortedKeyWrapperForObject:obj], idx, stop);
    }];
}

- (void)enumerateObjectsAtIndexes:(NSIndexSet *)indexSet options:(NSEnumerationOptions)opts usingBlock:(void (^)(id obj, NSUInteger idx, BOOL *stop))block {
    [_representedObject enumerateObjectsAtIndexes:indexSet options:opts usingBlock:^(id obj, NSUInteger idx, BOOL * stop) {
        block([CbxSortedKeyWrapper sortedKeyWrapperForObject:obj], idx, stop);
    }];
}

- (NSUInteger)countByEnumeratingWithState:(NSFastEnumerationState *)state objects:(__unsafe_unretained id *)stackbuf count:(NSUInteger)len {
    NSUInteger count = [_keys countByEnumeratingWithState:state objects:stackbuf count:len];
    for(NSUInteger i = 0; i < count; i++) {
        stackbuf[i] = [CbxSortedKeyWrapper sortedKeyWrapperForObject:stackbuf[i]];
    }
    return count;
}

@end
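For comparison, SDKs newer than the iOS 7 target in the question offer a built-in writing option, NSJSONWritingSortedKeys (iOS 11 / macOS 10.13 and later). The sketch below assumes such an SDK and deployment on a system that supports it; the helper name CbxSortedJSONData is hypothetical. It only addresses key order, not the other sources of variation in JSON text.

```objc
#import <Foundation/Foundation.h>

/// Hypothetical helper: serializes a JSON-compatible object with dictionary
/// keys written in sorted order. Returns nil if the object is not valid JSON
/// (in which case *error is left untouched) or if the option is unavailable.
static NSData *CbxSortedJSONData(id object, NSError **error) {
    if (![NSJSONSerialization isValidJSONObject:object]) {
        return nil;
    }
    if (@available(iOS 11.0, macOS 10.13, *)) {
        return [NSJSONSerialization dataWithJSONObject:object
                                               options:NSJSONWritingSortedKeys
                                                 error:error];
    }
    // Older systems have no sorted-key option; a proxy like the one above is one workaround.
    return nil;
}
```

Two dictionaries holding the same pairs, built in different insertion orders, should then serialize to identical bytes.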
0

First, valid JSON (text) is not suitable for generating a hash either: for a given object, there may be many valid JSON texts that represent it:

  • JSON is basically "text", and its character encoding is Unicode. There are five Unicode encoding schemes it can use: UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, or UTF-32LE. Each scheme yields a different hash, even if the object is the same.

  • JSON can contain insignificant whitespace such as spaces and tabs (for example, when it is "pretty printed").

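  • JSON strings may or may not use escaped Unicode characters, and the "solidus" / may optionally be escaped.

  • Furthermore, the members of a JSON object are unordered; that is, the order of the members of a JSON object is undefined.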
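The points in this list can be seen with a small check. This is an illustrative sketch only, with hypothetical helper names (CbxParseJSON, CbxSameObjectDifferentText); it uses NSJSONSerialization to parse two texts and compares the resulting Foundation objects.

```objc
#import <Foundation/Foundation.h>

/// Hypothetical helper: parses UTF-8 JSON text into Foundation objects,
/// returning nil if the text is not valid JSON.
static id CbxParseJSON(NSString *text) {
    NSData *data = [text dataUsingEncoding:NSUTF8StringEncoding];
    return [NSJSONSerialization JSONObjectWithData:data options:0 error:NULL];
}

/// Returns YES when two JSON texts differ character-for-character
/// but parse to equal objects.
static BOOL CbxSameObjectDifferentText(NSString *a, NSString *b) {
    return ![a isEqualToString:b] && [CbxParseJSON(a) isEqual:CbxParseJSON(b)];
}
```

For example, the texts {"a":1,"b":"x/y"} and { "b" : "x\/y", "a" : 1 } differ in whitespace, member order, and solidus escaping, yet describe the same object, so a hash over the raw text would differ between them.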

Thus, the same JSON (the object) can have several different representations as text.

JSON , , ( ), .

-, "" / JSON: JSON "" JSON , JSON, .

, / . , , , , JSON (. , JSON ). , , "solidus", .

, NSJSONSerialization "", (, ) "" JSON, JSON.

, , JSON, . /, , .

( "Canonicalize JSON" ) / JSON, Foundation: (, ++ ), JSON ( JSON), .

One option is a JSON parser/generator with an Objective-C API, implemented in C++, whose writer can sort keys and escape the solidus (JPJSONWriter options: JPJsonWriterSortKeys, JPJsonWriterEscapeSolidus).

Link: JPJson

JPJson " ". , " " " ". "HashGenerator", .

There is also another JSON library to look at: jsonlite (JsonLiteSerializer serializeDictionary:).

.

0
