Hello Patrick,
thanks for the READ() script function! This is surely a useful extension of the script
toolset.
However, I'd like to mention in this context that the engine has long contained
a mechanism for the general problem of large fields.
Some data (usually opaque binary data like the PHOTO or email attachments, but also
possibly very large text fields like NOTE) should be loaded on demand only, and not together
with the syncset and the fields that are needed for ID and content matching.
So string fields can have a proxy object (a better term would probably be "data
provider"), which is not called until the contents of the field are actually needed,
usually when encoding for the remote peer. The "p" mode flag
in the datastore <map>s controls the use of field proxies.
In the ODBC/SQL backend, these proxies are configured with their own SQL statement which
loads the field's data. In the plugin backend, which is used in SyncEvolution, the
single-field-pull mechanism is mapped onto the ReadBLOB/WriteBLOB API.
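To illustrate the proxy idea described above, here is a minimal C++ sketch of a field that defers loading until its value is first requested. All names in it (LazyStringField, Loader, demoLoadCount) are made up for illustration and are not the engine's actual classes; the loader callback merely stands in for whatever the backend does to pull a single field (a dedicated SQL statement or a ReadBLOB-style call).

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical illustration only: none of these names exist in libsynthesis.
class LazyStringField {
public:
  using Loader = std::function<std::string()>;

  // The loader stands in for the backend's single-field pull.
  explicit LazyStringField(Loader loader) : fLoader(std::move(loader)) {}

  // True once the loader has been run.
  bool isLoaded() const { return fLoaded; }

  // Fetches the value on first access and caches it afterwards.
  const std::string &value() {
    if (!fLoaded) {
      fValue = fLoader();
      fLoaded = true;
    }
    return fValue;
  }

private:
  Loader fLoader;
  std::string fValue;
  bool fLoaded = false;
};

// Small self-check: returns how often the loader ran after two reads.
inline int demoLoadCount() {
  int calls = 0;
  LazyStringField photo([&calls]() {
    ++calls;
    return std::string("JPEGDATA");
  });
  assert(!photo.isLoaded());  // nothing fetched while other fields are matched
  photo.value();              // first access triggers the pull
  photo.value();              // second access is served from the cache
  return calls;
}
```

In this sketch, syncset loading and content matching would never touch value(), so the expensive load only happens for items that are actually encoded for the peer.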
The proxy mechanism was even designed with the idea that really huge data should never be
loaded as a block, but only streamed through the engine. However, that was never
implemented on the encoding side, as the current SyncML item chunking mechanism is not
ready for streamed generation (total size must be known in advance).
But for an SQL-based server like our IOT server, it already helps a lot if contact images
are NOT loaded as part of the syncset loading, but only when actually needed.
This is JFYI - for the problem at hand the READ() script solution is surely a clean and
efficient way to go.
Best Regards,
Lukas
On Jul 22, 2011, at 9:37 , Patrick Ohly wrote:
Hello!
I'm currently working on
https://bugs.meego.com/show_bug.cgi?id=19661:
like N900/Maemo 5 before, MeeGo apps prefer to store URIs of local photo
files in the PIM storage instead of storing the photo data in EDS
(better for performance).
When such a contact is synced to a peer, the photo data must be included
in the vCard. Ove solved that in his N900 port by inlining the data when
reading from EDS, before handing the data to SyncEvolution/Synthesis.
This has the downside that the data must be loaded in all cases,
including those where it is not needed (slow sync comparison of the
other properties in server mode) and remains in memory much longer.
I'd like to propose a more efficient solution that'll work with all
backends. Ove, if this goes in, then you might be able to remove the
special handling of photos in your port.
The idea is that a) the data model gets extended to allow both URI and
data in PHOTO and b) a file URI gets replaced by the actual file content
right before sending (but not sooner).
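Step b) can be sketched in plain C++ as follows. This is only an illustration under assumptions: readWholeFile() and inlinePhoto() are hypothetical helpers, not part of libsynthesis or SyncEvolution, and the "uri"/"binary" strings simply mirror a VALUE-style field kept next to the photo. The size is obtained with ftell() after seeking to the end, since fseek() itself only reports success or failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: returns true and fills 'content' if 'path'
// could be read completely.
inline bool readWholeFile(const std::string &path, std::string &content) {
  content.clear();
  FILE *in = std::fopen(path.c_str(), "rb");
  if (!in) return false;
  if (std::fseek(in, 0, SEEK_END) == 0) {
    // ftell() yields the size for plain files; use it to pre-allocate
    long size = std::ftell(in);
    if (size > 0) content.reserve(static_cast<std::size_t>(size));
    std::rewind(in);
  } else {
    std::clearerr(in);  // not seekable, fall back to plain reading
  }
  char buf[8192];
  std::size_t n;
  while ((n = std::fread(buf, 1, sizeof(buf), in)) > 0) {
    content.append(buf, n);
  }
  bool ok = !std::ferror(in);
  std::fclose(in);
  return ok;
}

// Hypothetical helper: replaces a file:// URI in 'photo' by the file
// content and marks it as binary; any other value is left untouched.
inline bool inlinePhoto(std::string &photo, std::string &valueType) {
  const std::string prefix = "file://";
  if (valueType != "uri" ||
      photo.compare(0, prefix.size(), prefix) != 0) {
    return false;  // not a local file URI, keep as-is
  }
  std::string data;
  if (!readWholeFile(photo.substr(prefix.size()), data)) {
    return false;  // file missing or unreadable, keep the URI
  }
  photo = data;
  valueType = "binary";
  return true;
}
```

Calling inlinePhoto() only at the point where the item is encoded for the peer keeps the photo data out of memory during slow-sync comparison, which is the point of the proposal.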
Lukas, can you review the libsynthesis changes? See below. I pushed to
the bmc19661 branch on meego.gitorious.org. Some other fixes are also
included.
-----------------------
$ git log --reverse -p 92d2f367..bmc19661
commit 01c6ff4f7136d2c72b520818ee1ba89dc53c71f0
Author: Patrick Ohly <patrick.ohly(a)intel.com>
Date: Fri Jul 22 08:12:05 2011 +0200
SMLTK: fixed g++ 4.6 compiler warning
g++ 4.6 warns that the "rc" variable is getting assigned but never
used. smlEndEvaluation() must have been meant to return that error
code instead of always returning SML_ERR_OK - fixed.
diff --git a/src/syncml_tk/src/sml/mgr/all/mgrcmdbuilder.c b/src/syncml_tk/src/sml/mgr/all/mgrcmdbuilder.c
index ae040a4..601b530 100755
--- a/src/syncml_tk/src/sml/mgr/all/mgrcmdbuilder.c
+++ b/src/syncml_tk/src/sml/mgr/all/mgrcmdbuilder.c
@@ -698,7 +698,7 @@ SML_API Ret_t smlEndEvaluation(InstanceID_t id, MemSize_t *freemem)
return SML_ERR_WRONG_USAGE;
rc = xltEndEvaluation(id, (XltEncoderPtr_t)(pInstanceInfo->encoderState), freemem);
- return SML_ERR_OK;
+ return rc;
}
#endif
commit 8d5cce896dcc5dba028d1cfa18f08e31adcc6e73
Author: Patrick Ohly <patrick.ohly(a)intel.com>
Date: Fri Jul 22 08:36:22 2011 +0200
"blob" fields: avoid binary encoding if possible
This change is meant for the PHOTO value, which can contain both
binary data and plain text URIs. Other binary data fields might also
benefit when their content turns out to be plain text (shorter
encoding).
The change is that base64 encoding is not enforced if all characters
are ASCII and printable. That allows special characters like colon,
comma, and semicolon to appear unchanged in the content.
Regardless whether the check succeeds, the result is guaranteed to
contain only ASCII characters, either because it only contains those
to start with or because of the base64 encoding.
diff --git a/src/sysync/mimedirprofile.cpp b/src/sysync/mimedirprofile.cpp
index 4105d03..1499876 100644
--- a/src/sysync/mimedirprofile.cpp
+++ b/src/sysync/mimedirprofile.cpp
@@ -23,6 +23,7 @@
#include "syncagent.h"
+#include <ctype.h>
using namespace sysync;
@@ -2274,8 +2275,18 @@ sInt16 TMimeDirProfileHandler::generateValue(
}
// append to existing string
fldP->appendToString(outval,maxSiz);
- // force B64 encoding
- aEncoding=enc_base64;
+ // force B64 encoding if non-printable or non-ASCII characters
+ // are in the value
+ size_t len = outval.size();
+ for (size_t i = 0; i < len; i++) {
+ char c = outval[i];
+ if (!isascii(c) || !isprint(c)) {
+ aEncoding=enc_base64;
+ break;
+ }
+ }
+ // only ASCII in value: either because it contains only
+ // those to start with or because they will be encoded
aNonASCII=false;
}
else {
commit b69d0aecf612d0f009903179619a983706f3b8f7
Author: Patrick Ohly <patrick.ohly(a)intel.com>
Date: Fri Jul 22 08:44:21 2011 +0200
script error messages: fixed invalid memory access
If the text goes through macro expansion, then "aScriptText" is not
the chunk of memory which holds the script and "text" doesn't point
into it anymore. Therefore "text-aScriptText" calculates the wrong
offset.
Fixed by storing the real start of memory in a different variable
and using that instead of aScriptText.
Found when enclosing a string with single quotes instead of double
quotes. The resulting syntax error message contained garbled
characters instead of the real script line.
diff --git a/src/sysync/scriptcontext.cpp b/src/sysync/scriptcontext.cpp
index 35dff88..f21641e 100755
--- a/src/sysync/scriptcontext.cpp
+++ b/src/sysync/scriptcontext.cpp
@@ -2464,6 +2464,7 @@ void TScriptContext::Tokenize(TSyncAppBase *aAppBaseP, cAppCharP aScriptName, sI
text = itext.c_str();
}
// actual tokenisation
+ cAppCharP textstart = text;
SYSYNC_TRY {
// process text
while (*text) {
@@ -2540,7 +2541,7 @@ void TScriptContext::Tokenize(TSyncAppBase *aAppBaseP, cAppCharP aScriptName, sI
else if (StrToEnum(ItemFieldTypeNames,numFieldTypes,enu,p,il)) {
// check if declaration and if allowed
if (aNoDeclarations && lasttoken!=TK_OPEN_PARANTHESIS)
- SYSYNC_THROW(TTokenizeException(aScriptName, "no local variable declarations allowed in this script",aScriptText,text-aScriptText,line));
+ SYSYNC_THROW(TTokenizeException(aScriptName, "no local variable declarations allowed in this script",textstart,text-textstart,line));
// code type into token
aTScript+=TK_TYPEDEF; // token
aTScript+=1; // length of additional data
@@ -2616,7 +2617,7 @@ void TScriptContext::Tokenize(TSyncAppBase *aAppBaseP, cAppCharP aScriptName, sI
else if (strucmp(p,"WINNING",il)==0) objidx=OBJ_TARGET;
else if (strucmp(p,"TARGET",il)==0) objidx=OBJ_TARGET;
else
- SYSYNC_THROW(TTokenizeException(aScriptName,"unknown object name",aScriptText,text-aScriptText,line));
+ SYSYNC_THROW(TTokenizeException(aScriptName,"unknown object name",textstart,text-textstart,line));
text++; // skip object qualifier
aTScript+=TK_OBJECT; // token
aTScript+=1; // length of additional data
@@ -2641,13 +2642,13 @@ void TScriptContext::Tokenize(TSyncAppBase *aAppBaseP, cAppCharP aScriptName, sI
p=text;
while (isidentchar(*text)) text++;
if (text==p)
- SYSYNC_THROW(TTokenizeException(aScriptName,"missing macro name after $",aScriptText,text-aScriptText,line));
+ SYSYNC_THROW(TTokenizeException(aScriptName,"missing macro name after $",textstart,text-textstart,line));
itm.assign(p,text-p);
// see if we have such a macro
TScriptConfig *cfgP = aAppBaseP->getRootConfig()->fScriptConfigP;
TStringToStringMap::iterator pos = cfgP->fScriptMacros.find(itm);
if (pos==cfgP->fScriptMacros.end())
- SYSYNC_THROW(TTokenizeException(aScriptName,"unknown macro",aScriptText,p-1-aScriptText,line));
+ SYSYNC_THROW(TTokenizeException(aScriptName,"unknown macro",textstart,p-1-textstart,line));
TMacroArgsArray macroArgs;
// check for macro arguments
if (*text=='(') {
@@ -2772,7 +2773,7 @@ void TScriptContext::Tokenize(TSyncAppBase *aAppBaseP, cAppCharP aScriptName, sI
else token=TK_BITWISEOR; // |
break;
default:
- SYSYNC_THROW(TTokenizeException(aScriptName,"Syntax Error",aScriptText,text-aScriptText,line));
+ SYSYNC_THROW(TTokenizeException(aScriptName,"Syntax Error",textstart,text-textstart,line));
}
}
// add token if simple token found
commit e3fdd5ca811f24b2f80e598f9d00d2e134aa85e1
Author: Patrick Ohly <patrick.ohly(a)intel.com>
Date: Fri Jul 22 09:04:01 2011 +0200
scripting: added READ() method
The READ(filename) method returns the content of the file identified
with "filename". Relative paths are interpreted relative to the current
directory. On failure, an error message is logged and UNASSIGNED
is returned.
This method is useful for inlining the photo data referenced with
local file:// URIs shortly before sending to a remote peer. SyncEvolution
uses the method in its outgoing vcard script as follows:
Field list:
<!-- Photo -->
<field name="PHOTO" type="blob" compare="never"
merge="fillempty"/>
<field name="PHOTO_TYPE" type="string"
compare="never" merge="fillempty"/>
<field name="PHOTO_VALUE" type="string"
compare="never" merge="fillempty"/>
Profile:
<property name="PHOTO" filter="no">
<value field="PHOTO" conversion="BLOB_B64"/>
<parameter name="TYPE" default="no"
show="yes">
<value field="PHOTO_TYPE"/>
</parameter>
<parameter name="VALUE" default="no"
show="yes">
<value field="PHOTO_VALUE"/>
</parameter>
</property>
Script:
if (PHOTO_VALUE == "uri" &&
SUBSTR(PHOTO, 0, 7) == "file://") {
// inline the photo data
string data;
data = READ(SUBSTR(PHOTO, 7));
if (data != UNASSIGNED) {
PHOTO = data;
PHOTO_VALUE = "binary";
}
}
Test cases for inlining, not inlining because of non-file URI and
failed inlining (file not found) were added to SyncEvolution.
diff --git a/src/sysync/scriptcontext.cpp b/src/sysync/scriptcontext.cpp
index f21641e..e6124c9 100755
--- a/src/sysync/scriptcontext.cpp
+++ b/src/sysync/scriptcontext.cpp
@@ -27,6 +27,7 @@
#include "pcre.h" // for RegEx functions
#endif
+#include <stdio.h>
// script debug messages
#ifdef SYDEBUG
@@ -869,6 +870,55 @@ public:
aTermP->setAsInteger(exitcode);
}; // func_Shellexecute
+ // string READ(string file)
+ // reads the file and returns its content or UNASSIGNED in case of failure;
+ // errors are logged
+ static void func_Read(TItemField *&aTermP, TScriptContext *aFuncContextP)
+ {
+ // get params
+ string file;
+ aFuncContextP->getLocalVar(0)->getAsString(file);
+
+ // execute now
+ string content;
+ FILE *in;
+ in = fopen(file.c_str(), "rb");
+ if (in) {
+ long size = fseek(in, 0, SEEK_END) == 0 ? ftell(in) : -1;
+ if (size >= 0) {
+ // managed to obtain size, use it to pre-allocate result
+ content.reserve(size);
+ fseek(in, 0, SEEK_SET);
+ } else {
+ // ignore seek error, might not be a plain file
+ clearerr(in);
+ }
+
+ if (!ferror(in)) {
+ char buf[8192];
+ size_t read;
+ while ((read = fread(buf, 1, sizeof(buf), in)) > 0) {
+ content.append(buf, read);
+ }
+ }
+ }
+
+ if (in && !ferror(in)) {
+ // return content as string
+ aTermP->setAsString(content);
+ } else {
+ PLOGDEBUGPRINTFX(aFuncContextP->getDbgLogger(),
+ DBG_ERROR,(
+ "IO error in READ(\"%s\"): %s ",
+ file.c_str(),
+ strerror(errno)));
+ }
+
+ if (in) {
+ fclose(in);
+ }
+ } // func_Read
+
// string REMOTERULENAME()
// returns name of the LAST matched remote rule (or subrule), empty if none
@@ -2220,6 +2270,7 @@ const TBuiltInFuncDef BuiltInFuncDefs[] = {
{ "REQUESTMAXTIME", TBuiltinStdFuncs::func_RequestMaxTime, fty_none, 1, param_oneInteger },
{ "REQUESTMINTIME", TBuiltinStdFuncs::func_RequestMinTime, fty_none, 1, param_oneInteger },
{ "SHELLEXECUTE", TBuiltinStdFuncs::func_Shellexecute, fty_integer, 3, param_Shellexecute },
+ { "READ", TBuiltinStdFuncs::func_Read, fty_string, 1, param_oneString },
{ "SESSIONVAR", TBuiltinStdFuncs::func_SessionVar, fty_none, 1, param_oneString },
{ "SETSESSIONVAR", TBuiltinStdFuncs::func_SetSessionVar, fty_none, 2, param_SetSessionVar },
{ "ABORTSESSION", TBuiltinStdFuncs::func_AbortSession, fty_none, 1, param_oneInteger },
-----------------------
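The encoding decision from the "blob" fields commit above can be looked at in isolation. The following is a minimal sketch, assuming a free function named needsBase64() (a hypothetical name; the patch performs the check inline in TMimeDirProfileHandler::generateValue()). It iterates over unsigned bytes so that values above 0x7F are never passed to isprint() as negative numbers.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical free function for illustration only.
// Returns true if 'value' contains any byte that is not printable ASCII,
// in which case base64 encoding would be required.
inline bool needsBase64(const std::string &value) {
  for (unsigned char c : value) {
    // bytes above 0x7F are non-ASCII; isprint() rejects control characters
    if (c > 0x7F || !std::isprint(c)) {
      return true;
    }
  }
  return false;
}
```

With this check a URI such as file:///tmp/photo.jpg passes through unchanged, including its colon and slashes, while raw JPEG data or anything containing a line break still gets base64-encoded.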