Page refuses to index

This thread is closed to new posts. You must sign in to post in the forums.
9/28/2013 3:02:43 AM
Total Posts 537
feet planted firmly on the ground

Page refuses to index

I have a page that simply cannot be searched for, and I cannot work out why. I have rebuilt the index completely, and other pages are found with the search just fine. If I create new pages/content, these are found by the search within a couple of minutes.

If I copy and paste the content from the "bad" page and put it on another page, the new page gets found. But the original page is apparently invisible. No errors are reported in the system log. Also, if I publish the HTML content instance on another page, it does get found there after editing. So the problem is something to do with the page, not the content instance.

We have Content Workflow switched on, and this content is fully published.

If it's any help, the bad page is http://www.esdm.co.uk/lidar-processing

the page where it can be found (e.g. search on "steepness") is http://www.esdm.co.uk/lidar-data-processing

Any ideas?

In this case I'll probably end up using the alternative page and bin the original, but I'd like to know what is going on here as it may mean lots of other content not being indexed.

mojoPortal Version 2.3.9.8 MSSQL
Operating System Microsoft Windows NT 6.0.6002 Service Pack 2
ASP.NET Info v4.0.30319 Running in Full Trust
 

 

9/28/2013 2:52:16 PM
Total Posts 18439

Re: Page refuses to index

I'm not sure. We have a utility page for browsing the search index; you can drop it into your site and maybe find clues.

I would try editing the content from the page where it is not showing up. Just go in as an editor and save, then wait 10 minutes and see if it shows up in the search index.

It looks like the same content instance on both pages so I guess it is not marked as "exclude from search index" in module settings.

9/28/2013 4:18:57 PM
Total Posts 537
feet planted firmly on the ground

Re: Page refuses to index

Have already tried everything like that... page no show!   Will now try the utility page - thanks.

9/28/2013 4:33:13 PM
Total Posts 537
feet planted firmly on the ground

Re: Page refuses to index

OK the index browser blows up on this site (works on others), with this below in the system log. Does this tell us anything about the problem? (I have already completely rebuilt the index on the site).

2013-09-28 22:30:15,377 ERROR 192.168.54.4 - en-GB - /indexbrowser.aspx - mojoPortal.Web.Global -  Referrer(none) useragent Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/6.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0E; .NET4.0C; BRI/2; InfoPath.3; BOIE9;ENUS)
System.NullReferenceException: Object reference not set to an instance of an object.
   at IndexBrowser.aspx.__DataBinding__control11(Object sender, EventArgs e)
   at System.Web.UI.Control.OnDataBinding(EventArgs e)
   at System.Web.UI.Control.DataBind(Boolean raiseOnDataBinding)
   at System.Web.UI.Control.DataBind()
   at System.Web.UI.Control.DataBindChildren()
   at System.Web.UI.Control.DataBind(Boolean raiseOnDataBinding)
   at System.Web.UI.Control.DataBind()
   at System.Web.UI.WebControls.Repeater.CreateItem(Int32 itemIndex, ListItemType itemType, Boolean dataBind, Object dataItem)
   at System.Web.UI.WebControls.Repeater.CreateControlHierarchy(Boolean useDataSource)
   at System.Web.UI.WebControls.Repeater.OnDataBinding(EventArgs e)
   at System.Web.UI.WebControls.Repeater.DataBind()
   at IndexBrowser.aspx.BindIndex()
   at IndexBrowser.aspx.Page_Load(Object sender, EventArgs e)
   at System.Web.Util.CalliEventHandlerDelegateProxy.Callback(Object sender, EventArgs e)
   at System.Web.UI.Control.OnLoad(EventArgs e)
   at System.Web.UI.Control.LoadRecursive()
   at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)
 

9/29/2013 11:11:59 AM
Total Posts 18439

Re: Page refuses to index

That error would be expected if you try to use the utility with older versions of mojoPortal. The error is because the search index does not have one or more expected fields that should exist in the latest version.

If you are sure you are using mojoPortal 2.3.9.8 and are still getting this error, then look in your user.config file for any settings related to the search index and post them here so I can see them. Possibly you have some legacy settings that need to be changed or removed.

9/29/2013 3:17:44 PM
Total Posts 537
feet planted firmly on the ground

Re: Page refuses to index

Hi Joe, I have these settings:


  <add key="ShowRebuildSearchIndexButtonToAdmins" value="false" />
  <add key="DisableSearchFeatureFilters" value="false" />
  <add key="SearchUseBackwardCompatibilityMode" value="false" />
  <add key="EnableSearchResultsHighlighting" value="true" />
  <add key="SearchIncludeModuleRoleFilters" value="true" />

  <add key="UseLegacyCryptoHelper" value="false" />
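
For anyone wanting to pull the same kind of list from their own site, here is a minimal Python sketch. It assumes the config file is XML with an `appSettings` root and `add` children carrying `key` and `value` attributes, as in the settings posted above. The function name `find_settings` and the keyword filter are illustrative only and are not part of mojoPortal.

```python
import xml.etree.ElementTree as ET

# Sample input: the settings quoted in this thread, wrapped in an
# <appSettings> root (the root element is an assumption).
SAMPLE = """<appSettings>
  <add key="ShowRebuildSearchIndexButtonToAdmins" value="false" />
  <add key="DisableSearchFeatureFilters" value="false" />
  <add key="SearchUseBackwardCompatibilityMode" value="false" />
  <add key="EnableSearchResultsHighlighting" value="true" />
  <add key="SearchIncludeModuleRoleFilters" value="true" />
  <add key="UseLegacyCryptoHelper" value="false" />
</appSettings>"""


def find_settings(config_xml, keywords=("search", "legacy")):
    """Return {key: value} for <add> elements whose key contains a keyword.

    Matching is case-insensitive. The keyword list is a guess at what
    "search-related or legacy" means; adjust it as needed.
    """
    root = ET.fromstring(config_xml)
    matches = {}
    for node in root.iter("add"):
        key = node.get("key", "")
        if any(word in key.lower() for word in keywords):
            matches[key] = node.get("value", "")
    return matches


if __name__ == "__main__":
    for key, value in find_settings(SAMPLE).items():
        print(f"{key} = {value}")
```

To run it against a real file, read the file contents into a string and pass that to `find_settings` in place of `SAMPLE`.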

9/30/2013 5:18:42 AM
Total Posts 537
feet planted firmly on the ground

Re: Page refuses to index

Hi Joe

I think I've sorted it... (the original problem, not the problem with the index browser); the page was in "Draft" (in page settings). After changing this setting and editing the content, it is found in the search.

I'll confess I didn't know the page was in draft, because I had already double-checked that it was visible to anonymous web users.

I wonder whether this counts as a little bug, or at least a design quirk?  The page permissions in this case allowed it to be viewed by anyone, while the draft status serves to remove it from the menu and site map. In that light I can see why you might also suppress it from the search. But it also doesn't appear in the search results even for users with edit permissions (hence my OP).

I'm sure there are reasons why it is so, but the behaviour I would have expected would be: draft status to hide the page completely, except for users with edit/approval rights. But include the page in the search results for users with those rights.

10/1/2013 3:09:31 PM
Total Posts 18439

Re: Page refuses to index

Hi Crispin,

I think the correct behavior is not to index content on draft pages.

That is because draft status is not a security feature: a draft page is only protected by view roles, like any other page. Even so, I think we don't want people to easily find our draft content before it is finished, even if we have not locked it down. Often the draft is going to be a public page, so it is not protected by view roles; anyone who knew the URL would see the content, because it is only hidden by not being in the menu. This saves hassle for pages that will be public, because you don't have to come back later and publish the page by granting view permission to All Users.

The search index doesn't know about edit roles; it only knows about view roles, and it filters search results by the view roles that protect the indexed content. It also knows nothing about page draft status, so if we did index content on draft pages, that content would show up in search results for anyone allowed to view the page. Thus there would be a good chance of accidentally leaking content that wasn't meant to be visible because it is a draft.

Also, the purpose of the page draft status is for when you have draft instances of the HTML content feature on the page that you are working on before making the page visible. The HTML content feature also does not index draft versions of a content instance, so the draft page is not expected to be indexed until you uncheck draft and save it. The page itself is just a container; it has no intrinsic content to be indexed other than that of feature instances that support search.
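
The behaviour described above can be modelled with a small Python sketch. This is not mojoPortal's code: the `Page` and `Content` classes, the `build_index` and `search` functions, and the "Content Editors" role name are all hypothetical, and the sketch simplifies view roles to one set per page. It only illustrates the two rules from this post: draft pages and draft content versions are skipped at index time, and results are filtered by view roles only at query time.

```python
from dataclasses import dataclass, field


@dataclass
class Content:
    text: str
    is_draft: bool = False  # draft version of a content instance


@dataclass
class Page:
    url: str
    view_roles: set
    is_draft: bool = False  # page-level draft flag from page settings
    contents: list = field(default_factory=list)


def build_index(pages):
    """Index only published content that sits on non-draft pages."""
    index = []
    for page in pages:
        if page.is_draft:
            continue  # the whole page is skipped, whatever its view roles
        for content in page.contents:
            if not content.is_draft:
                index.append({
                    "url": page.url,
                    "text": content.text,
                    "view_roles": set(page.view_roles),
                })
    return index


def search(index, term, user_roles):
    """Match on text, then filter by view roles. Edit roles play no part."""
    return [
        entry["url"]
        for entry in index
        if term.lower() in entry["text"].lower()
        and entry["view_roles"] & set(user_roles)
    ]


# The same content published on two pages, one of which is in draft.
shared = Content("Slope and steepness derived from LiDAR data")
pages = [
    Page("/lidar-processing", {"All Users"}, is_draft=True, contents=[shared]),
    Page("/lidar-data-processing", {"All Users"}, contents=[shared]),
]
index = build_index(pages)

anonymous_hits = search(index, "steepness", {"All Users"})
editor_hits = search(index, "steepness", {"All Users", "Content Editors"})
```

In this model both `anonymous_hits` and `editor_hits` contain only `/lidar-data-processing`. The draft page never reaches the index, so holding an edit role makes no difference to what the search returns.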

Best,

Joe

10/1/2013 3:48:16 PM
Total Posts 537
feet planted firmly on the ground

Re: Page refuses to index

Thanks for the explanation Joe - I can see the logic if the search index doesn't know about edit roles, and will just have to remember not to try to find draft pages with the search!  And, having done a quick test, I'm reassured that when we switch a page from draft to non-draft, the page content is indexed - nice. 

10/1/2013 4:58:39 PM
Total Posts 537
feet planted firmly on the ground

Re: Page refuses to index

PS maybe the documentation about Site Search could note that pages/features in draft will not be indexed.

I always read the manuals before posting;-)

 
