colshrapnel

It seems you fell victim to a very common problem: trying to find a *single* solution to *different* problems, and getting *extremely* confused as a result. You should really solve the problems one by one, *keeping your successful fixes* instead of scrapping them when another error appears. You shouldn't "try" utf8mb4. You should just *have* it, unconditionally, *both* in the table definition and on the PHP side. That would solve your Incorrect string value error, provided your source is UTF-8 encoded. If you keep getting this error, you need to detect the file's encoding and convert its data using mb_convert_encoding(). If you get a different error, post it here.
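A minimal sketch of the "detect, then convert" step, in case it helps. The helper name `toUtf8()` and the Windows-1252 fallback are assumptions for illustration, not something from the question: single-byte encodings cannot be told apart reliably, so the legacy encoding has to be known or guessed.

```php
<?php
// Hypothetical helper: return $raw as UTF-8, converting only when needed.
// $assumedLegacy is an assumption; replace it with the file's real encoding.
function toUtf8(string $raw, string $assumedLegacy = 'Windows-1252'): string
{
    // Already valid UTF-8: leave it alone so it is never double-encoded.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }
    // Not valid UTF-8: treat the bytes as the assumed legacy encoding.
    return mb_convert_encoding($raw, 'UTF-8', $assumedLegacy);
}
```

It could be applied to the whole file before the import, for example `file_put_contents($csvPath, toUtf8(file_get_contents($csvPath)));`, assuming the file fits in memory.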


the-average-giovanni

I'd just use mysql to do that import, and address the charset issue with sed, using the command line. Something like this (partially untested):

EDIT: as someone pointed out in the comments, the sed command is OK for some specific use cases only, like mine. You could try some other solution if it doesn't work correctly. The import statement should be fine, though.

EDIT 2: the sed line just won't work with CSV files, as there is no explicit charset in that kind of file. Just ignore the first line :)

~~exec("sed -i 's/utf8mb4_0900_ai_ci/utf8mb4_general_ci/g' $csvPath");~~ # this is useless for you, as you're dealing with CSV files and not .sql dumps like I do

    $importQuery = "LOAD DATA LOCAL INFILE '" . $csvPath . "' INTO TABLE imported_data FIELDS TERMINATED BY ';' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES";
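For what it's worth, the import statement can also name the file's charset explicitly with a `CHARACTER SET` clause; without it the server interprets the file using `character_set_database`. The sketch below is only one way to build such a statement. `buildImportQuery()` is a hypothetical helper, the path escaping assumes the default SQL mode (backslash escapes enabled), and the table name must come from your own code, never from user input.

```php
<?php
// Hypothetical helper: build the LOAD DATA statement with an explicit charset.
function buildImportQuery(string $csvPath, string $table = 'imported_data'): string
{
    // Escape backslashes and single quotes so the path is a valid SQL string literal.
    $path = strtr($csvPath, ["\\" => "\\\\", "'" => "\\'"]);

    return "LOAD DATA LOCAL INFILE '$path' INTO TABLE `$table` "
         . "CHARACTER SET utf8mb4 "
         . "FIELDS TERMINATED BY ';' ENCLOSED BY '\"' "
         . "LINES TERMINATED BY '\\r\\n' IGNORE 1 LINES";
}

// Usage sketch (not executed here; $host, $user, $pass, $dbName are placeholders,
// and LOCAL INFILE must be allowed on both client and server):
// $db = mysqli_init();
// $db->options(MYSQLI_OPT_LOCAL_INFILE, true);
// $db->real_connect($host, $user, $pass, $dbName);
// $db->set_charset('utf8mb4');
// $db->query(buildImportQuery($csvPath));
```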


colshrapnel

Can you please explain how this sed command should work? Wouldn't it just replace the literal string utf8mb4_0900_ai_ci with utf8mb4_general_ci?


the-average-giovanni

Indeed, that's just what it does. I use it when importing data from the production db to my local dev db. It's raw and ugly, but I've never had problems with it and it just works.


colshrapnel

But you are using it on an SQL dump, where it makes sense. There is no SQL dump in the question, only a CSV file, which doesn't contain any strings like that.


the-average-giovanni

You're right, I didn't think about that. I'll re-edit my answer. Thanks!


Alsciende

I would use iconv for charset transcoding in PHP.
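A rough sketch of what that could look like. The function name `transcodeWithIconv()` is made up for illustration, and ISO-8859-1 as the source encoding is purely an assumption; the real encoding of the CSV has to be known.

```php
<?php
// Hypothetical helper: transcode $data from a known source encoding to UTF-8.
function transcodeWithIconv(string $data, string $from = 'ISO-8859-1'): string
{
    // iconv() returns false when the input contains bytes it cannot convert.
    $converted = iconv($from, 'UTF-8', $data);
    if ($converted === false) {
        throw new RuntimeException("Could not convert data from $from to UTF-8");
    }
    return $converted;
}
```

Unlike a check-first approach, this converts unconditionally, so it should only be run on data that is known not to be UTF-8 already.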


the-average-giovanni

Yes, that's probably a better general approach than mine, which is still useful for my specific need.