
Ref_impl_Java mailing list archives


Re: testParsingWithoutUTF8Encoding Error


That's a good idea.
Does that work for you on your Mac, Fábio?

Cheers
Sebastian

Rong Chen wrote:
Hi Sebastian, Fábio

If "UTF-8" is specified in the parser constructor, the test will be identical to the previous one, testParsingWithUTF8Encoding. So perhaps we should just specify an encoding other than UTF-8, for example "ISO-8859-1".


Cheers,
Rong



On 9 February 2010 11:34, Sebastian Garde <sebastian.garde@oceaninformatics.com> wrote:
Hi Fábio,

Sorry, I looked at the wrong test.

I think you are right: we need to construct the parser specifying UTF-8 here.


ADLParser parser = new ADLParser(loadFromClasspath(
                "adl-test-entry.unicode_BOM_support.test.adl"), "UTF-8");


I have checked in the updated code. Can you please check whether this works for you?

Cheers

Sebastian

Fábio Nogueira de Lucena wrote:
Hi Sebastian,

the method testParsingWithUTF8Encoding works fine as expected. The only test that fails is testParsingWithoutUTF8Encoding. 

----------------- WHAT SEEMS TO BE HAPPENING  ----------------------

testParsingWithoutUTF8Encoding calls ADLParser with only one argument, which in turn calls the constructor

  public SimpleCharStream(java.io.InputStream dstream, String encoding, int startline,
  int startcolumn, int buffersize)

with the parameter encoding == null. In this case, the InputStreamReader constructor that takes only a stream is used. In other words, the inputStream instance variable of SimpleCharStream reads with the platform default encoding. On Macs, Java's default encoding is Mac Roman. Instead of relying on the default encoding, maybe ADLParser should use UTF-8 (I am not sure whether that is right). When ADLParser uses UTF-8 on the Mac, everything works fine (testParsingWithUTF8Encoding passes).
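
To illustrate the difference between the default-encoding path and an explicit one, here is a minimal sketch. It is not the ADLParser or SimpleCharStream code: the class ExplicitEncodingSketch and the helpers openReader and readAll are hypothetical names introduced only for this example, and it assumes Java 7 or later for StandardCharsets. The idea is simply to fall back to UTF-8 when no encoding is supplied, rather than to whatever the platform default happens to be.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ExplicitEncodingSketch {

    // Hypothetical helper, not part of ADLParser or SimpleCharStream.
    // A null encoding falls back to UTF-8 instead of the platform default.
    static Reader openReader(InputStream in, String encoding) {
        Charset charset = (encoding == null)
                ? StandardCharsets.UTF_8
                : Charset.forName(encoding);
        return new InputStreamReader(in, charset);
    }

    // Reads the whole stream into a String.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "F\u00e1bio" is written with an escape so the source file encoding does not matter.
        byte[] utf8 = "F\u00e1bio".getBytes(StandardCharsets.UTF_8);
        String decoded = readAll(openReader(new ByteArrayInputStream(utf8), null));
        System.out.println("round trip ok: " + decoded.equals("F\u00e1bio"));
        System.out.println("platform default: " + Charset.defaultCharset());
    }
}
```

With a fallback like this, the result no longer depends on the machine the test runs on, which is the behaviour the question above is asking about. Whether UTF-8 is the right fallback for ADL files is a design decision for the parser.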

-----------------------------------------------------------------------------------------------------

That's the exception I got with a fresh checkout (revision 505):

Running se.acode.openehr.parser.UnicodeBOMSupportTest
se.acode.openehr.parser.TokenMgrError: Lexical error at line 1, column 1.  Encountered: "\u00d4" (212), after : ""
        at se.acode.openehr.parser.ADLParserTokenManager.getNextToken(ADLParserTokenManager.java:31649)
        at se.acode.openehr.parser.ADLParser.jj_consume_token(ADLParser.java:7075)
        at se.acode.openehr.parser.ADLParser.archetype(ADLParser.java:214)
        at se.acode.openehr.parser.ADLParser.parse(ADLParser.java:101)
        at se.acode.openehr.parser.UnicodeBOMSupportTest.testParsingWithoutUTF8Encoding(UnicodeBOMSupportTest.java:48)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at junit.framework.TestCase.runTest(TestCase.java:154)
        at junit.framework.TestCase.runBare(TestCase.java:127)
        at junit.framework.TestResult$1.protect(TestResult.java:106)
        at junit.framework.TestResult.runProtected(TestResult.java:124)
        at junit.framework.TestResult.run(TestResult.java:109)
        at junit.framework.TestCase.run(TestCase.java:118)
        at junit.framework.TestSuite.runTest(TestSuite.java:208)
        at junit.framework.TestSuite.run(TestSuite.java:203)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.maven.surefire.junit.JUnitTestSet.execute(JUnitTestSet.java:213)
        at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
        at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:127)
        at org.apache.maven.surefire.Surefire.run(Surefire.java:177)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:345)
        at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1009)
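
The character in the error message can be traced back to the byte order mark. The UTF-8 BOM is the byte sequence EF BB BF, and in Mac Roman the byte 0xEF maps to U+00D4, which is consistent with the "\u00d4" (212) reported at line 1, column 1. The sketch below decodes the BOM bytes with several charsets to show this. The class and method names are made up for the example, and the x-MacRoman charset is only used if the running JVM provides it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BomDecodeSketch {

    // The three bytes of the UTF-8 byte order mark.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Decodes the BOM bytes with the given charset and returns the first code point.
    static int firstCodePoint(Charset charset) {
        return new String(UTF8_BOM, charset).codePointAt(0);
    }

    public static void main(String[] args) {
        System.out.printf("UTF-8      -> U+%04X%n", firstCodePoint(StandardCharsets.UTF_8));
        System.out.printf("ISO-8859-1 -> U+%04X%n", firstCodePoint(StandardCharsets.ISO_8859_1));
        // x-MacRoman is an extended charset and may be missing from some runtimes.
        if (Charset.isSupported("x-MacRoman")) {
            System.out.printf("x-MacRoman -> U+%04X%n",
                    firstCodePoint(Charset.forName("x-MacRoman")));
        }
    }
}
```

Note that Java's UTF-8 decoder does not strip the BOM either: it yields U+FEFF as the first character, so the parser still has to accept or skip that character even when UTF-8 is specified.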

Thanks in advance.

Fábio


On Mon, Feb 8, 2010 at 10:43 AM, Sebastian Garde <sebastian.garde@oceaninformatics.com> wrote:
Hi Fábio,

I think the test itself is probably right.

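    public void testParsingWithUTF8Encoding() throws Exception {
        try {
            // Pass the encoding explicitly so the platform default is not used.
            ADLParser parser = new ADLParser(loadFromClasspath(
                "adl-test-entry.unicode_BOM_support.test.adl"), "UTF-8");
            parser.parse();
        } catch (Throwable t) {
            // Include the cause so a failure shows what the parser rejected.
            fail("failed to parse BOM with UTF8 encoding: " + t);
        }
    }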

If the test fails, I believe something is going wrong in the ADLParser, as the archetype cannot be parsed; presumably this happens in a Mac environment where UTF-8 is not the default encoding.
Unfortunately I don't own a Mac, and I am not sure if Rong does, but maybe you can send us the error message so we can see what the parser is expecting?

I also notice that the archetype is in Windows format (i.e. it uses a CR LF pair to terminate lines, whereas Unix and Mac OS X use an LF character only and classic Mac OS used a CR character only).
Maybe you can convert it to Unix or Mac format and see if that helps (but be sure to keep the invisible byte order mark (BOM) at the beginning of the file; otherwise the test may pass but would no longer test what it should).
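
One way to do such a conversion without losing the BOM is to work on the raw bytes, as in the sketch below. This is only an illustration: LineEndingSketch, crlfToLf and startsWithUtf8Bom are hypothetical names, and the approach assumes the file is UTF-8, where the bytes 0x0D and 0x0A never occur inside a multi-byte sequence.

```java
import java.io.ByteArrayOutputStream;

final class LineEndingSketch {

    // Replaces each CR LF pair with LF; every other byte, including a leading BOM, is copied as-is.
    static byte[] crlfToLf(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
        for (int i = 0; i < input.length; i++) {
            boolean crBeforeLf = input[i] == '\r'
                    && i + 1 < input.length
                    && input[i + 1] == '\n';
            if (!crBeforeLf) {
                out.write(input[i]);
            }
        }
        return out.toByteArray();
    }

    // True if the data starts with the UTF-8 byte order mark EF BB BF.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        byte[] windows = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a', '\r', '\n', 'b', '\r', '\n'};
        byte[] unix = crlfToLf(windows);
        System.out.println("BOM kept: " + startsWithUtf8Bom(unix));
        System.out.println("length before/after: " + windows.length + "/" + unix.length);
    }
}
```

Checking startsWithUtf8Bom on the converted file before committing it guards against an editor or tool having silently dropped the mark.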

Regards
Sebastian


Fábio Nogueira de Lucena wrote:
Hi,

The unit test testParsingWithoutUTF8Encoding fails on my Mac. I tried to make my Mac use UTF-8 by default instead of Mac Roman, but gave up. However, I am not sure the test is right. Shouldn't the test work even with a different default encoding?

Thanks in advance.

Fábio


_______________________________________________
Ref_impl_java mailing list
Ref_impl_java@openehr.org
http://lists.chime.ucl.ac.uk/mailman/listinfo/ref_impl_java

--
 
Ocean Informatics
Dr Sebastian Garde
Senior Developer
Ocean Informatics
Dr. sc. hum., Dipl.-Inform. Med, FACHI

Skype: gardeseb

