Legacy Cyrillic in Hash Tagging

Status
Not open for further replies.

motd2

Customer
Cyrillic in Hash Tagging - resolved
In dbtech/usertag/includes/class_core.php, at line 915, find:
Code:
 if (preg_match('/[^\w-]/i', $letter) OR in_array($letter, array(';', '', ' ', "\r", "\n", "\t", "\s", ".", ","))) // in_array($letter, array(';', '', ' ', "\r", "\n", "\t", "\s", ".", ","))
Replace with:
Code:
 if (preg_match('/[^\w-]/iu', $letter) OR in_array($letter, array(';', '', ' ', "\r", "\n", "\t", "\s", ".", ","))) // in_array($letter, array(';', '', ' ', "\r", "\n", "\t", "\s", ".", ","))

Tested on vB3 and vB4 (UTF-8 and windows-1251) with English and Cyrillic text.
It will probably work with any encoding.
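
For anyone wondering what the extra u does, here is a minimal sketch that can be run outside vBulletin. It assumes a UTF-8 source file and a reasonably recent PHP where the u modifier also enables Unicode character properties. The helper name is_tag_text is made up for this example and is not part of the add-on. The last line reflects what seems to happen in the add-on's byte-by-byte loop: a single byte taken from a Cyrillic character is not valid UTF-8, so preg_match() returns false and the byte is no longer treated as a delimiter.

```php
<?php
// Hypothetical helper for illustration; not part of the add-on.
// Returns true when $text consists only of word characters and hyphens.
function is_tag_text($text)
{
    return preg_match('/^[\w-]+$/u', $text) === 1;
}

var_dump(is_tag_text('visits'));      // bool(true)
var_dump(is_tag_text('посещения'));   // bool(true): with /u, \w covers Cyrillic letters
var_dump(is_tag_text('two words'));   // bool(false): the space is not a word character

// A lone byte from a multi-byte character is not valid UTF-8, so with /u
// preg_match() reports an error by returning false instead of 0 or 1.
var_dump(preg_match('/[^\w-]/iu', "\xD0"));   // bool(false)
```

Without the u modifier the result for non-ASCII bytes depends on the server's locale settings, which is why it is left out of the sketch.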
 
motd2

Update: If possible, I'd like you to beta test a change for me that may improve hash tagging altogether.

In /dbtech/usertag/includes/class_core.php find:
PHP:
			case 'hash':
				$max = 100;
				$min = 1;

				$post = preg_replace('/&#\d{3,5};/', '', $post);
				while (($at = strpos($post, '#', $at)) !== false)
				{
					$at++;
					$username = '';

					if ((in_array(substr($post, $at - 2, 1), array(' ', "\r", "\n", "\t", "\s", ".", ",")) !== false) OR (($at - 2) < 0))
					{
						for ($x = 0; $x <= $max; $x += strlen($letter))
						{
							$letter = $post[$offset = $at + $x];
							if (preg_match('/[^\w-]/iu', $letter) OR in_array($letter, array(';', '', ' ', "\r", "\n", "\t", "\s", ".", ","))) // in_array($letter, array(';', '', ' ', "\r", "\n", "\t", "\s", ".", ","))
							{
								$matches[] = $username;
								break;
							}

							if ($letter == '&' AND preg_match('/&#\d{3,5};/', $post, $multi, NULL, $offset))
							{
								$letter = $multi[0];
							}

							$username .= $letter;
						}
					}
				}
				break;

Replace with
PHP:
			case 'hash':
				$post = preg_split('/[\s,]+/u', $post);
				foreach ($post as $word)
				{
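					// Skip empty strings that preg_split() can return, then check the first character
					if (isset($word[0]) AND $word[0] == '#')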
					{
						// This was a match
						$matches[] = substr($word, 1);
					}
				}
				break;

I have successfully tested this with the word посещения on my local test installation (ISO-8859-1 in the Language settings, latin1_swedish_ci in the database).

Could you please test this out with your database schema, and just generally try to break it? :p

I'll put this change live in an update on Monday :)
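
To make it easier to try to break, the split-based idea can be exercised on its own. The following is only a sketch under a few assumptions: the function name extract_hash_tags is hypothetical (the add-on does this inline in its 'hash' case), the input is valid UTF-8, and the PREG_SPLIT_NO_EMPTY flag and the false check are additions that are not in the snippet above.

```php
<?php
// Hypothetical standalone helper; not the add-on's actual code.
function extract_hash_tags($post)
{
    // PREG_SPLIT_NO_EMPTY drops the empty strings that leading or trailing
    // whitespace would otherwise produce.
    $words = preg_split('/[\s,]+/u', $post, -1, PREG_SPLIT_NO_EMPTY);
    if ($words === false)
    {
        // preg_split() failed, for example on input that is not valid UTF-8.
        return array();
    }

    $tags = array();
    foreach ($words as $word)
    {
        // Keep words that start with '#' and have at least one character after it.
        if ($word[0] === '#' AND strlen($word) > 1)
        {
            $tags[] = substr($word, 1);
        }
    }
    return $tags;
}

// Multi-line post with tags at the start and in the middle of the text.
print_r(extract_hash_tags("#посещения\nhere text message\nmore #text, #two"));
```

On this input the sketch returns the three tags посещения, text and two. If the same text gives a different result inside the forum, the difference is probably in what $post looks like by the time it reaches this code.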
 
Fillip H.

If you use the single word посещения, it works.

If you use the word followed by more text:
#посещения
here text message
more text

all the words are combined into one hashtag and the notification does not work.

7405190.png
 