tyler.kendrick

Caffeine Addict
Veteran
Joined
Nov 21, 2014
Messages
52
Reaction score
14
First Language
English
Primarily Uses
* Edit: Neglected to put [Ace] in the header.

On Rubular, the following regular expression produces 7 matches, as expected:

/^<(\w+)\s*(\w+=".*")*\s*(?:(?:\/\s*>)|(?:>(.*)<\s*\/\s*\1\s*>))$/m

However, the following line only returns the last match in a note box.

/^<(\w+)\s*(\w+=".*")*\s*(?:(?:\/\s*>)|(?:>(.*)<\s*\/\s*\1\s*>))$/m.match(note) { |m|
  name = m[1]
  attributes = @options[:parse_attr].call(m[2])
  innerText = @options[:parse_text].call(m[3])
  msgbox_p("#{name} found with attributes: #{attributes.join(',')}. innerText=#{innerText}")
}

The note contains the following text:

This be text

<tag/>

<tag2></tag2>

<tag3 value="text" />

<tag4>innerText</tag4>

<tag5 value="text" value2="text">

inertia

</tag5>

 

<alert>Message from a1</alert>

<actor_tag>inner text</actor_tag>
The text is the same for rubular and the note section this was called from.

Any ideas why the engine appears to parse differently from the note section?  Is there some oddness with line-breaks that I'm neglecting?
 

FenixFyreX

Fire Deity
Veteran
Joined
Mar 1, 2012
Messages
434
Reaction score
311
First Language
English
Primarily Uses
Use the method String#scan; it returns all of the matches, like so:

Code:
string = <<HDOC
<tag/>
<tag2></tag2>
<tag3 value="text" />
<tag4>innerText</tag4>
<tag5 value="text" value2="text">
inertia
</tag5>

<alert>Message from a1</alert>
<actor_tag>inner text</actor_tag>
HDOC
string.scan(/^<(\w+)\s*(\w+=".*")*\s*(?:(?:\/\s*>)|(?:>(.*)<\s*\/\s*\1\s*>))$/m)
# => [['tag', nil, nil, nil], ['tag2', nil, nil], ['tag3', 'value="text"', nil]] # and so on, so forth
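For readers comparing the two calls: Regexp#match stops at the first match, while String#scan walks the whole string. The sketch below uses a deliberately simplified pattern (TAG) and a made-up sample string, both of which are assumptions for illustration and not the expression from this thread.

```ruby
# Simplified, hypothetical pattern: self-closing tags only.
TAG = /<(\w+)\s*\/>/

# Regexp#match returns MatchData for the first match only (or nil).
def first_tag(text)
  m = TAG.match(text)
  m && m[1]
end

# String#scan returns one entry per match; with a single capture
# group each entry is a one-element array, so flatten the result.
def all_tags(text)
  text.scan(TAG).flatten
end

sample = "<a/> <b/> <c/>"
p first_tag(sample)  # => "a"
p all_tags(sample)   # => ["a", "b", "c"]
```

Note that scan alone cannot help if the pattern itself fails on most lines, which is what the rest of the thread turns out to be about.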
 

tyler.kendrick

Still doesn't address the issue.

Put the following call-script on an event:

actor_id = 1
actor = $data_actors[actor_id]
regexp = /^<(\w+)\s*(\w+=".*")*\s*(?:(?:\/\s*>)|(?:>(.*)<\s*\/\s*\1\s*>))$/m
note = actor.note
matches = note.scan(regexp) { |x|
  msgbox_p("matched: " + x.inspect)
}

Then put the following text in the actor's note section:

This be text

<tag/>

<tag2></tag2>

<tag3 value="text" />

<tag4>innerText</tag4>

<tag5 value="text" value2="text">

inertia

</tag5>

 

<alert>Message from a1</alert>

<actor_tag>inner text</actor_tag>
The Problem: Only one match (the last tag) is found.

The same regular expression matches many tags in Rubular, but only matches the last tag in a note section.
 

tyler.kendrick

I made a silly discovery.  Yes, the behavior is different between rubular's regexp engine and RMVXA's.  However, this seems to be because of the way the note section is parsed.

I believe this is because, when the note is parsed, the line endings are converted to escape characters. That would mean "$" prevents a match on every tag except the last one, which stays valid (assuming it is not followed by a line-break or any other character).

Simply removing the "$" character from the regexp will allow RMVXA's engine to parse the note text uninterrupted.
 

Tsukihime

Veteran
Veteran
Joined
Jun 30, 2012
Messages
8,564
Reaction score
3,877
First Language
English
No, line endings are preserved as you would expect in a Windows environment: \r\n
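This explains the symptom. In Ruby, $ matches only at the end of the string or immediately before a "\n". With "\r\n" line endings the "\r" sits between the closing ">" and the spot where $ can match, so ">$" fails on every line except an unterminated last one. A minimal sketch, assuming a simplified self-closing-tag pattern and made-up note strings (neither comes from the thread):

```ruby
# Simplified, hypothetical pattern: self-closing tags only.
SELF_CLOSING = /^<(\w+)\s*\/\s*>$/

# Collect the tag names that the pattern finds in a note.
def self_closing_names(note)
  note.scan(SELF_CLOSING).flatten
end

lf_note   = "<tag/>\n<tag3/>\n<last/>"      # Unix-style line endings
crlf_note = "<tag/>\r\n<tag3/>\r\n<last/>"  # Windows-style line endings

p self_closing_names(lf_note)    # => ["tag", "tag3", "last"]
p self_closing_names(crlf_note)  # => ["last"]
```

Text pasted into a browser form is a likely candidate for arriving with plain "\n" line endings, which would account for the different result on Rubular, though that is an assumption here.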
 

cremnophobia

Veteran
Veteran
Joined
Dec 10, 2013
Messages
225
Reaction score
100
Primarily Uses
I have to say I'd consider this a bug. I don't expect CRLF as the newline. The source code of scripts also uses it. Ace does the right thing by using UTF-8 where it matters, even though Windows uses UTF-16LE (and the legacy ANSI code pages). Why not also use only LF? That is far easier and faster than converting strings from/to UTF-8, and is just as sane.


At least the String#encode and the newline transcoders work in RGSS3.
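One way to use the newline transcoder is to normalize the note once, before matching, so that patterns ending in $ behave as they do on LF-only text. The sketch below relies on the universal_newline option of String#encode in standard Ruby; the helper names and the simplified pattern are assumptions, and it has not been checked inside RGSS3 itself.

```ruby
# Simplified, hypothetical pattern: self-closing tags only.
SELF_CLOSING_TAG = /^<(\w+)\s*\/\s*>$/

# Convert CRLF line endings to LF using the newline transcoder.
def normalize_newlines(text)
  text.encode(universal_newline: true)
end

# Normalize first, then scan with the unchanged pattern.
def tag_names(note)
  normalize_newlines(note).scan(SELF_CLOSING_TAG).flatten
end

crlf_note = "<tag/>\r\n<tag3/>\r\n<last/>"
p normalize_newlines("a\r\nb")  # => "a\nb"
p tag_names(crlf_note)          # => ["tag", "tag3", "last"]
```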
 

Zeriab

Huggins!
Veteran
Joined
Mar 20, 2012
Messages
1,296
Reaction score
1,493
First Language
English
Primarily Uses
Other
With CR+LF being the Windows platform newline, I would rather say that not expecting it as the newline in a Windows program is a user error ;)

More generally I would say to expect CR?LF as a possible newline match. The truth is more complicated, but realistically we will only encounter LF and CR+LF as newlines.

@tyler:

With the tags being numbers, requiring start and end tags seems like a rather unnecessary condition. Do you really need to enforce such a constraint?

*hugs*

 - Zeriab
 

FenixFyreX

Further testing revealed my mistake in my response above; my apologies. I would recommend replacing $ with [\r\n]+; that works for me with the example you provided, from an event parsing actor 1's notebox.


[\r\n]+ matches both *nix and Windows line ending styles, so it's a general go-to anyways.
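One caveat worth checking: [\r\n]+ needs at least one line-break character, so a tag on the very last line with nothing after it is skipped. An optional "\r" before $ (in the spirit of the CR?LF suggestion earlier in the thread) covers that case too. A small comparison, assuming simplified self-closing-tag patterns and made-up note text:

```ruby
# Both patterns are simplified, hypothetical stand-ins for the full expression.
WITH_BREAKS = /^<(\w+)\s*\/\s*>[\r\n]+/  # requires a trailing line break
WITH_OPT_CR = /^<(\w+)\s*\/\s*>\r?$/     # optional CR, then end of line

def names_with_breaks(note)
  note.scan(WITH_BREAKS).flatten
end

def names_with_opt_cr(note)
  note.scan(WITH_OPT_CR).flatten
end

note = "<tag/>\r\n<tag3/>\r\n<last/>"  # no line break after the last tag
p names_with_breaks(note)  # => ["tag", "tag3"]
p names_with_opt_cr(note)  # => ["tag", "tag3", "last"]
```

If the note always ends with a line break, both variants find the same tags.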
 
