Mixed case data

Hi - I’m looking for some input regarding mixed case data. Many of our varchar columns in our Oracle databases contain mixed case data. That presents a problem to my BO users when they try to build conditions on their reports. If they type in the case incorrectly, their query won’t return the right results.

One solution that I’ve thought of, is to prefix each column’s select statement with the UPPER command in Designer. Then, the BO users would always have to type in upper case when building conditions & prompts. However, does the UPPER command contribute to poor performance of the query?

Another solution would be to have the DBA convert those cols to UPPER when he’s populating the data mart. However, then the data is physically converted to upper case permanently.

Anyone have an opinions about this? Thanks!

Lisa Hale
Steelcase, Inc.
616-246-9273


Listserv Archives (BOB member since 2002-06-25)

Lisa,

You can also try setting up Lists of Values for all ‘condition’ type objects. Using the @prompt in the condition object, you can force the user to only be able to select from an LOV (i.e. cannot type in directly)…thereby ensuring that they never enter an invalid value. These values can be dynamically regenerated each time the user selects from the list, or stored permenantly as part of the universe (although this can make the universe somewhat large).

And yes, using any oracle function on an indexed column will result in the index not being used, thereby contributing to poor performance.

Jason Beard
AGD Computer Services


Listserv Archives (BOB member since 2002-06-25)

Hi Lisa,

I’m new to this. I’ll take a chance at answering your question. If your condition is on a column that is indexed, and you put a function on that column (like upper), Oracle will not use the index. So that will cause performance degradation. In applications I have worked on I generally leave most character columns as uppercase in the database and use the initcap function to display the data in mixed case.

I hope this helps.

Marian Cooney
HBO & Company
610-296-7658 (x1108)


Listserv Archives (BOB member since 2002-06-25)

Try selecting the data from the List of Values instead of typing it in. You avoid typos and also let the actual database contents determine the value specified in the condition. I’ve had good luck with this method.

George Baranowski/QuadraMed

Hi - I’m looking for some input regarding mixed case data. Many of our varchar columns in our Oracle databases contain mixed case data. That presents a problem to my BO users when they try to build conditions on
their
reports. If they type in the case incorrectly, their query won’t return
the
right results…

Lisa Hale
Steelcase, Inc.
616-246-9273


Listserv Archives (BOB member since 2002-06-25)

In a message dated 98-11-17 09:50:56 EST, you write:

Hi - I’m looking for some input regarding mixed case data. Many of our
varchar columns in our Oracle databases contain mixed case data. That presents a problem to my BO users when they try to build conditions on
their
reports. If they type in the case incorrectly, their query won’t return
the
right results.

One solution that I’ve thought of, is to prefix each column’s select statement
with the UPPER command in Designer. Then, the BO users would always have
to
type in upper case when building conditions & prompts. However, does the UPPER command contribute to poor performance of the query?

Yes, it would! If you end up with a condition that is JUST the following:

upper(table.column) = upper(user response)

… then the database optimizer will not be able to use any indexes on the table.column entry. But read on…

Another solution would be to have the DBA convert those cols to UPPER when he’s populating the data mart. However, then the data is physically converted
to upper case permanently.

Yes, that is also true.

Following is a response I sent to the list some time ago that provides a solution to this problem. I kept a copy because this subject turns up fairly often, and I didn’t want to have to type it all in again. :slight_smile:

Regards,
Dave Rathbun
Integra Solutions
www.islink.com

***** Originally posted on 9/3/98 *******

We need to search text fields containing mixed upper and lower case long
descriptive information. The database entries are also inconsistent so there is no guarantee that upper case letters appear only at the beginning of a word.

We would like to be able to do a non case-sensitive search effectively BUT at the same time retain how the descriptions were originally keyed in for our reports.

Apart from specifying all possible combinations of upper and lower case for the search, is there anything that will work like an anycase ?

For this example I will assume that the field being searched is a customer name.

There is an interesting trick to this. You could, for example, create an object that uses your database UPPER() command (if available) to force all of the data in the table to be upper case. That would allow your users to search with confidence, knowing that all of the search conditions should be in upper case. This object could be called Search Name. It would be valid as a condition, but not as a result object. You can control these settings when the object is set up in the Designer application.

Next, create an object called Customer Name that can be a result object but cannot be a search object. This object would not use the UPPER() function, and would be used for display purposes on the report.

This should work, but will have a side effect of ruining any chance of using a database index on the customer name column. (If the field in question is not indexed, then this is not a concern.) If the field is indexed,however, then you will lose that performance benefit. To solve that problem, you should go another way.

Instead of using just a search based on the UPPER version of the customer name, it turns out that the following set of conditions are often enough to use the index and search for matches based on the upper case version. If you are searching for a customer named “Brown” then it goes something like this:

(customer_name like ‘Br%’ OR
customer_name like ‘br%’ OR
customer_name like ‘bR%’ OR
customer_name like ‘BR%’) AND
upper(customer_name) = ‘BROWN’

It turns out that specifying all possible case options for the first two letters is enough to narrow down the search using an index, while the last item checks for the exact match. This is, however, not very user friendly! You would not want to train your users to build conditions in this way!

So, the final solution: build an object that picks apart the user’s criteria and builds the condition for them. You did not specify which database you are using, so I’ll provide an example using Oracle. Because it’s the easiest.

The predefined condition would be called ‘Customer Name Search’ and the where clause could be something like:

(
customers.customer_name like
upper(substr(@prompt(‘Enter customer name’,‘A’,mono,free),1,1) || lower(substr(@prompt(‘Enter customer name’,‘A’,mono,free),2,1) || ‘%’ OR customers.customer_name like
lower(substr(@prompt(‘Enter customer name’,‘A’,mono,free),1,1) || lower(substr(@prompt(‘Enter customer name’,‘A’,mono,free),2,1) || ‘%’ OR customers.customer_name like
lower(substr(@prompt(‘Enter customer name’,‘A’,mono,free),1,1) || upper(substr(@prompt(‘Enter customer name’,‘A’,mono,free),2,1) || ‘%’ OR customers.customer_name like
upper(substr(@prompt(‘Enter customer name’,‘A’,mono,free),1,1) || upper(substr(@prompt(‘Enter customer name’,‘A’,mono,free),2,1) || ‘%’ ) AND upper(customers.customer_name) =
upper(@prompt(‘Enter customer name’,‘A’,mono,free))

In this example, I am not using a list of values for customer names. It probably is not necessary, as the user is simply expected to type in the name. If you are going to use the list of values feature, then this trick is not required as you will have an exact match. Note: be very careful with the parenthesis! You have to have the four “OR” conditions inside parenthesis, and the AND condition outside.

(test 1 OR test 2 OR test 3 OR test 4) AND test 5

Recall that if you use the same prompt over and over, then BusObj will only prompt the user one time, ensuring that the same value is used throughout the logic for the entire condition. As I said, this object should work for Oracle; as long as your database has an equivalend Sub String, Upper, and Lower function then you should be able to convert this idea to work.

Hope this helps!

Regards,
Dave Rathbun
Integra Solutions
www.islink.com


Listserv Archives (BOB member since 2002-06-25)

OK, I have to ask this question:
Does it mean, that you CAN have a list of values to show in prompt built on some database column (as opposed to list in @Prompt command)?

Very curious,
Ryszard Mikke

–==> Hiroshima’45 Tschernobyl’86 Windows’95 <==-- R.Mikke@pl.vwfsag.de

From: George Baranowski[SMTP:gbaranowski@QUADRAMED.COM]
Sent: 17 listopada 1998 16:38

Try selecting the data from the List of Values instead of typing it in. You
avoid typos and also let the actual database contents determine the value specified in the condition. I’ve had good luck with this method.

George Baranowski/QuadraMed

Hi - I’m looking for some input regarding mixed case data. Many of our varchar columns in our Oracle databases contain mixed case data. That presents a problem to my BO users when they try to build conditions on
their
reports. If they type in the case incorrectly, their query won’t return
the
right results…

Lisa Hale
Steelcase, Inc.
616-246-9273

Pls report bounces in response to postings to BUSOB-L-Request@listserv.aol.com
Web archives (9am-5pm ET only): listserv.aol.com/archives/busob-l.html
OR search: Mail to listserv@listserv.aol.com, ‘search a_phrase in BUSOB-L’
Unsubscribe: Mail to listserv@listserv.aol.com, ‘unsubscribe BUSOB-L’


Listserv Archives (BOB member since 2002-06-25)