MOD-153

Incompatible indexes between OSMHB, LXX, Byz, TR, WHNU, KJV, NASB

    Details

      Description

      See http://www.crosswire.org/bugs/browse/SWEB-14
      I am not sure whether this bug affects all sword software or only the web interface, but I believe the indices are intrinsic to the modules so I am guessing that the only solution is to amend the modules, right?

          Activity

          chrislit Chris Little added a comment -

          Report basically pertains to Strong's numbers (not indexes) not being correctly normalized in software. We could do some limited normalization within the texts, but it's not a priority, and given that other normalization must be performed in software, it might as well all be performed in software for the time being. I'm fairly sure the normalization functions are already in Sword, so it's possible swordweb is simply not using them.

          lwq David Lim added a comment -

          Hmm, I agree that some parsing can be done by the software using the modules, but what about those with multiple numbers or extra characters, like "3588|3056", "3761|1520", "0853|01254" and "1254!a"? In NASB, searching for "1254" or "1254!b" gives the same list, while searching for "1254!a" gives a subset. I do not see how it is possible to know that the correct numbers are "3056", "1520", "01254" and "1254" without knowing either what the other numbers stand for or the format used in the module. That is why I thought it might be much easier to normalize the modules themselves. I would be glad to do that if I can be given the uncompressed text files.

          dmsmith DM Smith added a comment -

          The different forms:
          Gdddd - This is the root form. It can begin with a G, g, H or h and be followed by up to 5 digits. It may have leading 0's to pad it to the same width.
          Gdddd|Gdddd - This indicates that multiple Greek (or Hebrew) words are translated into one word. Typically, the first is G3588, the definite article.
          Gdddd!a - This is used by the NASB to indicate that Gdddd is split into multiple definitions. In these cases, a search that does not find the entry with !a should look for it without the suffix and return that. Note: where there is a !a there is also a !b, and there may be more. Each of these words is related but represents a disagreement with (or enhancement to) Strong's original work.
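
          The fallback rule for the !a form could be sketched roughly as follows. This is an illustration only, not the SWORD or JSword API: the function name, the plain dict standing in for a lexicon module, and the sample entries are all hypothetical.

```python
def lookup_with_fallback(lexicon, key):
    """Return the entry for key; if a suffixed key such as 'H1254!b' is
    missing, retry with the '!x' suffix stripped. Returns None if neither
    form is present."""
    if key in lexicon:
        return lexicon[key]
    base, sep, _suffix = key.partition("!")
    if sep and base in lexicon:
        return lexicon[base]
    return None


# Hypothetical lexicon contents, for illustration only.
lexicon = {"H1254": "base entry", "H1254!a": "first sense"}

print(lookup_with_fallback(lexicon, "H1254!a"))  # exact match on the suffixed key
print(lookup_with_fallback(lexicon, "H1254!b"))  # falls back to "H1254"
```

          Whether the fallback belongs in the engine or in the front-end is a design choice this sketch does not settle.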

          The SWORD (or JSword) engine has the responsibility to normalize these upon a module lookup to a leading letter, a 5 digit padded number and if !a is present, then the 'a' appended to the number.
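
          As a rough sketch of that normalization (leading letter, number zero-padded to 5 digits, suffix letter appended), something like the following would do. It is not the actual SWORD or JSword code; the function name is hypothetical, and upper-casing the leading letter is an assumption made here so that "g" and "G" compare equal.

```python
import re

# A leading G/g/H/h, 1 to 5 digits, and an optional "!x" suffix.
_STRONGS = re.compile(r"([GgHh])([0-9]{1,5})(?:!([A-Za-z]))?")


def normalize_strongs(raw):
    """Normalize one Strong's key to letter + 5-digit padded number + suffix.

    Assumption: the leading letter is upper-cased. Raises ValueError for
    input that does not match the forms described in this thread.
    """
    match = _STRONGS.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not a recognised Strong's key: {raw!r}")
    letter, digits, suffix = match.groups()
    return f"{letter.upper()}{int(digits):05d}{suffix or ''}"


print(normalize_strongs("g3056"))     # G03056
print(normalize_strongs("H01254!a"))  # H01254a
```

          The pattern rejects the "Gdddd|Gdddd" form on purpose, since splitting that form is described as a front-end task.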

          The front-ends have the responsibility to split the Gdddd|Gdddd form and give each to the SWORD (JSword) engine for processing. Very few front-ends do this well. I'm not at all sure what SwordWEB should do with its multiple highlighting.
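
          A front-end might do the split along these lines. This is a minimal sketch, not SwordWEB code: the function name and the default_prefix parameter are hypothetical, and prepending a prefix to bare numbers such as "0853|01254" assumes the front-end already knows whether the text is tagged with Greek or Hebrew numbers.

```python
def split_strongs(raw, default_prefix=""):
    """Split a 'Gdddd|Gdddd' value into individual keys.

    Parts that start with a digit get default_prefix prepended
    (an assumption: the caller knows the testament or language).
    Empty parts are skipped.
    """
    keys = []
    for part in raw.split("|"):
        part = part.strip()
        if not part:
            continue
        if part[0].isdigit():
            part = default_prefix + part
        keys.append(part)
    return keys


print(split_strongs("G3588|G3056"))      # ['G3588', 'G3056']
print(split_strongs("0853|01254", "H"))  # ['H0853', 'H01254']
```

          Each resulting key would then be handed to the engine separately; how the results are combined for highlighting is left open here, as it is in the discussion.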

          I concur with Chris that this is not a module problem. At best it is a front-end problem. Perhaps you should move this to SwordWEB as a bug?


            People

            • Assignee:
              chrislit Chris Little
              Reporter:
              lwq David Lim