8
our $VERSION = '3.003'; # Don't forget to update the TestCompat set for testing against installed decoders!
8
our $VERSION = '3.005_001'; # Don't forget to update the TestCompat set for testing against installed decoders!
9
9
our $XS_VERSION = $VERSION; $VERSION= eval $VERSION;
11
11
# not for public consumption, just for testing.
257
257
override this optimization and use a standard REFN ARRAY style tag output. This
258
258
is primarily useful for producing canonical output and for testing Sereal itself.
260
See L</CANONICAL REPRESENTATION> for why you might want to use this, and
261
for the various caveats involved.
262
265
Normally C<Sereal::Encoder> will output hashes in whatever order is convenient,
626
There are also a few cases where Sereal will produce different documents
627
for values that you might think are the same thing, because if you
628
e.g. compared them with C<eq> or C<==>, perl itself would think they
629
were equivalent. However, for the purposes of serialization they're not
632
A good example of these cases is where L<Test::Deep> and Sereal's
633
canonical mode differ. We have tests for some of these cases in
634
F<t/030_canonical_vs_test_deep.t>. Here are the issues we've noticed so
639
=item Sereal considers ASCII strings with the UTF-8 flag to be different from the same string without the UTF-8 flag
643
my $language_code = "en";
647
my $language_code = "en";
650
Sereal's canonical mode will encode these strings differently, as it
651
should, since the UTF-8 flag will be passed along on interpolation.
653
But this can be confusing if you're just getting some user-supplied
654
ASCII strings that you may inadvertently toggle the UTF-8 flag on,
655
e.g. because you're comparing an ASCII value in a database to a value
656
submitted in a UTF-8 web form.
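
As a minimal sketch of the situation described above (the variable names are illustrative, and C<utf8::upgrade> merely stands in for whatever code path sets the flag in a real application), two strings can compare equal with C<eq> while differing in their internal UTF-8 flag:

```perl
use strict;
use warnings;

# Two copies of the same ASCII string.
my $plain    = "en";
my $upgraded = "en";

# Assumption for illustration: force the UTF-8 flag on, as can happen
# when a value has passed through a UTF-8 decoding layer.
utf8::upgrade($upgraded);

# Perl's string comparison ignores the flag entirely.
print "eq says: ", ($plain eq $upgraded ? "same" : "different"), "\n";

# The flag itself is still observable, and it differs between the two.
print "flag on \$plain:    ", (utf8::is_utf8($plain)    ? 1 : 0), "\n";
print "flag on \$upgraded: ", (utf8::is_utf8($upgraded) ? 1 : 0), "\n";
```

Since the flag is the only thing that differs, a comparison based on C<eq> sees one value while a canonical-mode encoding sees two.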
658
=item Sereal will encode strings that look like numbers as strings, unless they've been used in numeric context
660
I.e. these values will be encoded differently, respectively:
663
my $IV_y = "12345" + 0;
665
my $NV_y = "12.345" + 0;
667
But as noted above, something like Test::Deep will consider these to be
672
We might produce certain aggressive flags to the canonical mode in the
673
future to deal with this. For the cases noted above some combination
674
of turning the UTF-8 flag on for all strings, or stripping it from
675
strings that have it but are ASCII-only would "work". Similarly, we
676
could scan strings to see if they match C<looks_like_number()> and if
679
This would produce output that would either be a lot bigger (having to
680
encode all numbers as strings), or would be more expensive to generate
681
(having to scan strings for numeric or non-ASCII context), and for
682
some cases, like the UTF-8 flag munging, wouldn't be suitable for
683
general use outside of canonicalization.
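
Until such flags exist, a caller who wants this behaviour can normalize values before encoding. The following is only a sketch under stated assumptions: C<normalize_scalar> is a hypothetical helper, not part of Sereal, and it handles plain scalars only. It stringifies every value and strips the UTF-8 flag from ASCII-only strings, with the size and cost trade-offs described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Sereal API): return a normalized copy
# of a plain scalar so that "equal-looking" values encode the same way.
sub normalize_scalar {
    my ($value) = @_;

    # Leave undef and references untouched; this sketch is scalar-only.
    return $value if !defined $value || ref $value;

    # Stringify into a fresh scalar, so numbers become plain strings.
    my $copy = "$value";

    # Strip the UTF-8 flag only when the content is pure ASCII.
    if (utf8::is_utf8($copy) && $copy !~ /[^\x00-\x7F]/) {
        utf8::downgrade($copy, 1);    # second argument: do not die on failure
    }

    return $copy;
}

my $flagged = "en";
utf8::upgrade($flagged);
my $normalized_string = normalize_scalar($flagged);
my $normalized_number = normalize_scalar("12345" + 0);
```

Walking nested structures and deciding whether this normalization is acceptable for a given data set is left to the caller; as the text notes, it is not suitable for general use.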
625
687
Often, people don't actually care about "canonical" in the strict sense